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Summary 



Interferometry has been widely used for astronomy in the last century at ra- 
dio wavelengths; in the last decades, it has gained an important place also in 
the medium and near infrared wavelengths range. There are many differences 
between the two fields, due especially to the constraints posed by different 
behaviour of noise sources, flux intensities, instrumental limits at different 
wavelengths. 

However, the potentiality of interferometry with respect to observation with 
a telescope, especially the higher angular resolution, has encouraged the as- 
tronomical community to concentrate big efforts on this subject, in terms of 
research, money and time. Nowadays, several interferometric arrays, working 
at infrared wavelengths, have been built all over the world, or are under con- 
struction. One of them is the VLTI, an ambitious project of the European 
Southern Observatory (ESO). Prom the beginning of this millennium, its first 
interferometric data in the near infrared have been recorded. 

The aim of studying ever fainter sources, with increasing angular resolution, 
requires a great accuracy in the control of the interferometric process. In par- 
ticular, if the correlation between the interfering beams is maintained high and 
stable, the integration time can be increased sensibly, still providing a mean- 
ingful integrated fiux. Otherwise, the phase information is lost, e.g. due to 
atmospheric and environmental disturbances. For these reasons, ESO decided 
to equip the VLTI with fringe trackers, i.e. instruments able to sense the rela- 
tive position of the interfering beams and to correct it to a nominal position. 

In this framework, the Astronomical Observatory of Torino, part of the Italian 
National Institute of Astrophysics (INAF-OATo), has been involved from the 
late nineties in the design and development of a first fringe tracker, FINITO 
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and then of a fringe sensor for the PRIMA instrument, i.e. the PRIMA FSU. 
Designing and building a fringe sensor is a challenging task, with great difficul- 
ties. One of them is how to extract information about the fringes parameters 
from raw data. This is the subject of this thesis. 

The two instruments differ for the opto-mechanical layout, for the choice be- 
tween to temporal vs. spatial modulation, and for the quantity and type of 
data available. From the point of view of simulation, the fundamental differ- 
ence is the model adopted for the interferogram pattern. For FINITO, it is 
very simple and based essentially on theoretical predictions. The algorithms 
for the optical path difference identification that we present in chap. [2] use 
extensively this model. They are able to work with good results in the central 
area of the coherence length, but their principal limit is that they need to 
process a normalized interferometric signal. 

In principle they can be modified in order to adapt to the inputs, but this 
leads to the necessity of changing the underlying model. 

In chapter [3] we describe a more detailed model that is still based on the previ- 
ous theoretical one, but it contains a number of parameters to be opportunely 
tuned to easily adapt to the current signal. The availability for the PRIMA 
FSU of a larger number of interferometric signal samples (twelve instead of 
two) allows the implementation of a weighted least squares fit of the measured 
data to the new model. The algorithm works well and fast, thanks to the use 
of tabulated functions for the reference signal template. 

Of course, this is true if one assumption we make is true, i.e. that the template 
model is consistent with the current interferometric measurement conditions. 
Every discrepancy between the real signal and the tabulated template gives 
an error on the fringe parameters estimation. 

Some of the model parameters can be assumed to remain stable or very-slowly 
changing during the life time of the instrument, or at most to require a check 
on the time scale of months. Other terms, indeed, needs to be properly de- 
termined before any observational night or even more often, such as source- 
dependent ones. For this reason, we implement a set of calibration procedures. 
Their primary goal is to estimate the value of the critical parameters of the 
model, such as the overall instrumental transmission and phase functions, the 
visibility and the magnitude of the source. These values can then be fed into 
the template of the least square algorithms for the fringe location. 
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We tested it with both simulated and laboratory data, and we were able to 
reconstruct very well the spectral features of the measured signal. There are 
still some discrepancies between the intensities, especially for channels with 
lower flux. We can suspect that there is some phenomenon we do not include 
or properly model. 

Such a doubt highlights the particular condition in which we are working. Our 
model is given by a deterministic equation, describing the optical power of two 
electromagnetic waves that interfere. In this sense, it is a correlation between 
the two waves. However, it does not describe what happens to the other term 
of the model, i.e. the noise. 

These considerations have lead to the idea of analyzing interferometric data 
using classical instruments of the Statistical Sciences, such as analysis in the 
time and in the frequency domains. We have used VLTI data. Their peculiar 
nature has required efforts to tailor the statistical methods and to understand 
their results. For a particular problem, that is, the impact of estimation and 
subtraction of a signal trend on the estimated spectral density function, we 
give a mathematical derivation of the bias on the spectral density in appendix 

El 

Since the treated signals are not stationary, we analyze their variability, mak- 
ing use of statistical tests, in order to have a significance, and with regression 
analysis tool, trying to get out as much information as possible. All this sta- 
tistical part is the subject of chapter HI 

This work has been developed in collaboration between the Astronomical Ob- 
servatory of Torino and the Department of Mathematics of the University of 
Torino. From my point of view, the collaboration between mathematicians and 
astronomers, the exchange of knowledge, the needs to find a common state- 
ment of the problem, the twofold interpretation of each results have been a 
challenging opportunity for improvement. 

Several of the results achieved in this framework deserve further investigation 
on a more complete set of experimental conditions, and many of the tools pro- 
posed could fruitfully be included in either on-line or off-line diagnostics and 
data analysis software for interferometric instruments. 
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This work can be divided in two parts. 

In the first part, the attention is focused on OPD and GD algorithms that 
we have proposed for FINITO (chapter [2]) and PRIMA FSU (chapter E]). We 
describe the interferometric model on which they are tailored and their theo- 
retical performances. For PRIMA, we also test the model reconstruction from 
laboratory data. 

These algorithms refiect, in their increasing complexity, the increasing knowl- 
edge we gain on the manipulation of interferometric data, especially on the 
model. It must be noted that fringe tracking present different data analy- 
sis problems with respect to visibility extraction and interpretation. Indeed 
there are few good estimators for fringe position and their properties depend 
on instrumental features. The analysis of this first part showed us that, even 
if algorithms have good performances, efforts are still needed to deepen the 
relation between source flux, noises and so on. 

The second part (chapter H]) is devoted to the identiflcation and test of possible 
statistical tools able to answer some of our questions. We started from this 
problem: is it possible to check the presence of a noise due to combination, 
and in the affermative case, how to estimate it? We use data from different 
instruments, more suited to our purposes. There are many other questions, 
that can be posed, and lot of work has still to be done. The last paragraph of 
chapter H] will point out some of these questions. 

In Appendix A we consider the power spectral density function (PSD) of beams 
characterized by a linear trend, evolving in time. We are interested on the effect 
on the frequency spectrum determined by removing an estimated linear trend. 
This problem, which arises while analyzing interferometric data in chapter HI 
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is discussed in the special case of a detrended process that results wide sense 
stationary process. 

Finally, appendix [B] collects all graphics that were not inserted in the corre- 
sponding sections to avoid a too heavy presentation. 



Chapter 1 



Interferometry: from theory to 
fringe tracking 

In this chapter, we present the apphcation of interferometry to astronomical 
observation: the historical development, the state of art, why it is useful and 
what are the goals. We describe the physical process, and how it can be 
modeled, and we justify the need for fringe tracking. Finally, we introduce the 
working environment of the thesis: the VLTI and its instruments. 

1.1 Introduction 

The sky and the stars have been the subject of enthusiastic research since the 
oldest records of human activity. In the last years, several missions have begun 
to scan the sky from the space, but ground observations are still the dominant 
means. 

When looking at the sky from the ground, one of the great limits to accuracy 
and resolution is certainly the atmospheric turbulence. To face this problem, 
in the last century one branch of the technical development was devoted to the 
improvement of large telescope performance, using adaptive optics to flatten 
the incoming wavefront. Another branch that is becoming important in the 
last decades is interferometry in the near infrared part of the electromagnetic 
spectrum, after the success achieved in the second half of the XXth century 
by radio interferometry. Its most appealing feature is its angular resolution, 
i.e. the minimum distance between stellar sources at which the instrument is 
able to recognize the sources as distinct. 

When observing with an array of telescopes in interferometric mode, the high- 
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est achievable angular resolution 9 is limited by the longest baseline^ i.e. the 
maximum separation between pairs of telescopes in the array: 



where A is the observing wavelength and B is the baseline. For a single tele- 
scope, the angular resolution is inversely proportional to the diameter of the 
collecting surface: 



where A is the observing wavelength, and D is the aperture diameter. 

Since the angular resolution is related to a ratio of lengths, it depends upon 

the order of magnitude of both terms. 

When working with radio wavelengths (from millimiters to meters) , to achieve 
a good angular resolution is necessary to have large baselines, up to kilometers, 
but the sensitivity is high. For short wavelengths (from a fraction to tenths 
of //m for optical and infrared observation) the angular resolution is accept- 
able also for a single aperture, but not comparable with that achievable with 
baselines of hundreds of meters. Moreover, having two collecting areas should 
increase the limiting sensitivity. 

Actually, bigger telescopes do not guarantee a better sensitivity, because the 
atmospheric turbulence degrades rapidly their performances. The coherence 
area ac, i.e. the area where the wavefront can be considered fiat, limits the 
angular resolution; it depends on the wavelength and the Fried parameter tq: 



The Fried parameter is a characteristic of the observing site and of the current 
observing conditions, and can be measured. For short wavelengths, the coher- 
ence area is small, and without correction of the wavefront, the big aperture 
is useless. 

These considerations drove both the development of telescopes of increasing 
aperture, with sophisticated procedures for wavefront flattening (active and 
adaptive optics) , and the construction of interferometers. Given that for tech- 
nological issues the biggest apertures now achievable are of orders of 10 m, 
interferometry has today an important place, and is the subject of a crucial 
research field. 




(1.1) 




(1.2) 



ac oc — rad. 

To 



(1.3) 
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As an example, let us consider a large telescope with aperture D = 10 m 
observing in the near infrared range (A = 2 fim). Its angular resolution will 
beP p. 36]: 

Otei = 1-22-^ = 2 ■ 10"^ rad = 0.05 arcsec (1.4) 

If we are using an interferometer with baseline B = 100 m, the angular reso- 
lution will be[ll p. 39]: 

Oint = ^ = 10"^ ™^ = 0-002 arcsec (1.5) 

To reach the same angular resolution with a radio wavelength, say A = 1 mm, 
we would need a baseline of: 

1 TTITTI 

0.002 arcsec > 5 = 5 ■ 10^ m (1.6) 

2x5 

that is, 5000 km! This is the best currently achievable by VLBI, i.e. radioint- 
erferometers using the whole Earth as observing baseline. 

There are different ways to produce interference images, and they will be briefly 
presented in par. II. 3[ All interferometers, however, share some common com- 
ponents: two or more telescopes, connected with the combination laboratory 
through a beam-transport system, a delay line, to compensate the optical path 
introduced by the observing geometry, a beam combiner and a detector. With 
every solution, however, the ambition of observing fainter and fainter sources 
imposes strong conditions on the instrument sensitivity and on the control of 
optical and instrumental variables. In particular, the optical path difference 
(hereafter, OPD) between the beams before the combination has a crucial role, 
because if it is maintained near zero, the integration time can be increased from 
a fraction of second to minutes, or hours, with a great benefits on the sensi- 
tivity. 

This has led to the conception and construction of dedicated interferometer' 
subsystems, the fringe sensors and fringe trackers, able to measure and correct 
the optical path difference between the beams at nanometer level. 



1.2 History of interferometry 

The real angular size of stellar objects, compared to the observed one, and 
the way to measure them, have been central questions for astronomers from 
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centuries. It was in 1801 tliat Tliomas Young isolated the light interference in 
laboratory experiments: "homogeneous light, at certain equal distances in the 
direction of its motion, is possessed of opposite qualities, capable of neutraliz- 
ing or destroying each other, and of extinguishing the light, where they happen 
to be united" (Young, 1804). It was William Herschel p] that first observed, in 
1805, that unfilled apertures allowed to obtain better angular resolution than 
the whole aperture, but we have to wait until 1835 for a theoretical explanation 
(Airy), and 1867 for the proposal of multiple apertures, with Fizeau^. The 
first practical results came with Stefane [1], at the Observatoire de Marseille in 
1874 and Hamy[5] at the Observatoire de Paris in 1893. Albert Michelson in 
1891 [U] measured the angular diameter of the Jupiter satellites with great pre- 
cision, and in 1921 the first stellar interferometer was mounted at the Mount 
Wilson Telescope in California (Michelson & Pease[6]). The technological chal- 
lenge involved was heavy: to limit mechanical instabilities, the baseline was of 
about fifteen meters; photometric evaluation on fringes pattern were made by 
human eye. 

After the second world war, the higher resolution offered by an interferometer 
made this technique to become a standard in radio astronomy, that knew a 
great development, thanks to the relaxed tolerances offered by macroscopic 
wavelengths. We recall the work of Hambury Brown and Tiss, that in 1956 
showed that photons coming from a common source are correlated, and this 
correlation survives the process of photoelectric emission on which detectors 
are based [7]. Baselines grew from meters to kilometers, imaging procedures 
through efficient sampling of sky regions became well established. A detailed 
exposition of a number of theoretical and practical aspects can be found in [S] . 

When photon-counting detectors became available in the seventies, and laser 
control of the optical path went to the micrometers level, allowing baselines 
to grow to tens of meters, the radio techniques could be adapted to short 
wavelengths: modern optical and infrared interferometry was born. 
Here we just mention, in chronological order, the pioneering work of Labeyrie, 
with the speckle interferometry, the phase tracking stellar interferometer from 
Shao & Staelin in 1977, the first fringes seen at 2.2 fim by Di Benedetto & Conti 
(1983), the first fully automated interferometer (the Mark III at Mount Wilson, 
Shao & Colavita, 1988), the introduction of single mode optical fibers (e.g., 
Coude du Foresto in 1992), the optical synthesis imaging from Cambridge, in 
1996 with Baldwin. References can be found, e.g., in pj, page 330]. 
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Figure [TTT] shows a list of available optical/lR ground facilities, with the number 
and size of apertures, the maximum baseline, and their state of development. 
This table has been taken from [1]. The most ambitious projects are the Keck 
Interferometer (KI) in Hawaii and the Very Large Telescope Interferometer 
(VLTI) at Cerro Paranal, in Chile. References describing these arrays can be 
found in literature; for the KI, see Colavita & Wizinowich[9j, for the VLTI, 
see Glindemann et al.fTO] 



Faeliity 


Operating 


bite 


No. of 


Element 


Maximum 


Operating 


Operating 


Acronym 


Instltutlon(s) 


Location 


Collecting 


Aperture 


Baseline 


Wavelength 


Status 








Elements 


(cm) 


(m) 


(microns) 




GI2T 


Obs. Cflte d'Aaur 


CaJern, FH 


2 


150 


70 


0.4-0.8 & >1.2 


since 1B85 


IS I 


UC Berks lay 


Mt. Wilson, US 


3 


165 


iO + 


10 


since 1990 


COAST 


Cambridge U 


Cambridge^ UK 


5 


40 


22 


0.4-0.95 & 2.2 


since 1901 


SUSI 


Sydney U 


Narrabrl, AU 


13 


14 


640 


0,4-0.66 


since 19&1 


IOTA 


CfA/U Masi 


Mt. Hopkins, US 


3 


4B 


38 


0.5-2.2 


since 1993 


NPOI 


USNO/NRL 


Anderson Mesa^, US 


6 


60 


4SS 


0,45-0.85 


since 1995 


PH 


JPL/Caltech 


Mt. Palomar. US 


2 


40 


110 


1,5-2,4 


since 1995 


MIRA-I 


NAO Japan 


Tokj^o, Japan 


2 


25 


4 


O.S 


since 19S& 


CHARA 


Georgia St. U 


Mt. Wilson, US 


6 


100 


350 


0.45-2.4 


since 1909 


Ki 


CARA 


Mauna Kea, US 


2(4) 


1,000(180) 


140 


2.2-10 


initial 2001 


VLTI 


ESO 


Cerro Paranal, Chile 


4(3) 


820(180) 


200 


0.45-12 


initial 2001 


LBT 


U Arizona., Italy, et al. 


Mt. Graham, US 


2 


S40 


23 


0.4-400 


initial 2005T 



Figure 1.1: List of available optical/infrared ground telescopes arrays, 
taken from [T]. 

The interferometry research is an open field: there are still unexplored issues, 
especially technological problems, highlighted by the analysis of the first avai- 
lable optical and IR interferometric data. An overview of technological matters 
and scientific goals of optical interferometry can be found in the review of Mon- 
nier, 2003 



Interferometry potentialities 

The potentialities of interferometry are of course dependent upon instrumental 
limitations, such as maximum baseline length for the angular resolution, num- 
ber of combination for a good sky coverage, fiux coupling between apertures 
for the limiting sensitivity. Data with these good properties could assure the 
validation of theoretical models. Some of the most appealing goals are, for ex- 
ample, the study of close binary systems, for a precise determination of stellar 
masses, precise measurements of stellar diameters and their changes, for pul- 
sational models, large surveys of sky portions at extreme magnitudes (toward 
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twenty!). We also mention the possibihty of using interferometry for differ- 
ential measures, thanks to instruments (such as PRIMA or AMBER at the 
VLTI) able to simultaneously observe in two different directions: for example, 
for validation of the range of atmospheric models. 



1.3 Principles of Interferometry 

To describe the interference process we can consider the classic experiment of 
Young. 

The light from a point source passes into two pinholes at a certain distance. 
The two resulting beams are then combined and imaged on a surface. Due to 
the wave nature of the light, the electromagnetic fields interfere alternatively 
constructively and destructively, depending on the difference of the optical 
path they have covered. The figure of interference shows black area alternate 
with bright ones. 

There are fundamentally two types of beam combination, requiring different 
mechanical and optical solution for the superposition of beams, but equivalent 
in ideal conditions: the image-plane and the pupil-plane interferometry. The 
main difference is where the beam combination takes place. Their properties 
make them best suited for intermediate resolution and for high resolution, re- 
spectively (see, e.g., [U ch. 3] and [H ch. 4]). 

In the image-plane method, each beam is focused on the image plane (a de- 
tector, for example), and the images are superposed. This is called Fizeau 
interferometer. 

In the pupil-plane method, the beams are superposed before being focused on 
the image plane, in a beam combiner, that can be a glass, or an optical fiber. 
The beams are then separated again and focused separately on the detector. 
This is called Michelson interferometer. 

We are interested in instruments based on the latter method. 

In the following paragraphs, we recall the fundamental principles of interfer- 
ometry, following the notation of [1]. 
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1.3.1 Monochromatic interference 

The wave nature of stellar beams, written as electromagnetic fields, gives a 
simple description of the physical process. 

Let (f){x,t) be an electromagnetic monochromatic wave at frequency u = j, 
where c is the light speed, traveling in the space in the direction n. We can 
write it as: 

t) ~ Ae*(^---2--t) = ^e*(fc-— (1.7) 

where 

kn 

27rz/ _ 27r 
^ ~ T 

271U (1.8) 



Now let us consider an idealized interferometer, as shown in figure 11.21 Two 
telescopes are given, separated by a basehne B, and they are both pointing to 
a distant point source located at a distance S from the center of the baseline. 
Thus, the pointing direction is given by the versor = n. The source is taken 
sufficiently distant so that the wavefront can be considered ffat. 

When the light beam traveling from the distant source arrives at telescope 1, 
at position xi, its equation will be, following eq. 11.71 

M^i,t) ~ Ae^(^^"^-'^*) = Ae-^('='"^"i+'^*) (1.9) 

and equivalently will be the beam at telescope 2 at position X2: 

where we have used the fact that the baseline B is given hj B = X2 — Xi. 
Comparing eqs. 11.91 and 11.101 we can notice that the only difference between 

^ We are neglecting here the velocity change of the beam due to the travel into a medium, 
as the atmosphere, which modifies the beam wavelength: 

c Xiy 
V = - = — = A„i/ 
n n 

where n is the refraction index of the medium. Moreover, we are neglecting the atmospheric 
turbulence too, which adds a random optical path to the beam. 



k 
k 



00 IS 



the angular frequencjE]. 
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Distant 
Source 




Figure 1.2: Idealized interferometer scheme. 



the two waves is given by e , a term that depends just on the geometry of 
the system including the interferometer (B) and the observed source (s). 

After the collection at the telescope level, the two beams 0i and 02 travel into 
the interferometer arms to reach a common point where they will interfere, 
covering a distance of di and d2, respectively, as shown in fig. 11.21 This 



1.3 Principles of Interferometry 
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additional optical path increments the total optical path covered by the beams 
from the stellar source: 

where we have neglected, without loss of generality, the common component 
j^^-iksx^^ The interference process adds the two waves: 

= + 02(t) ~ e-'^\e'^'^^ + e-'^'^+'^^^) (1.12) 

The detector is sensitive to the optical poweiil P of the resulting beam, defined 
as P = 0* ■ 0. We finally get the optical power over a unit time integration: 

P oc 0(t)*0(t) = e-*"*e^'^*(e^^°'i + ^-^^-^B+ikd^^i^^-ikd^ ^ ^iksB-ikd^-^ ^ 

-j^ _j_ ^~ikdi—iksB+ikd2 _j_ ^ikdi+iksB—ikd2 _j_ 

= 2[1 + cosk{sB + di - d2)] = 2[1 + coskD]. (1.13) 
where we have posed D = sB + di — d2. 

Looking at this equation, we can see that the optical power is given by an 
offset and a sinusoidal wave with frequency k = 2ti/\ over the spatial variable 
D, called the optical path difference, or OPD. It will have a crucial importance 
all over the work of this thesis. 



In an ideal but concrete example, if each telescope has a collecting area of 
A [m?], if the source emits at wavelength A a constant fiux F [photons ■ 
(time unit)~^ ■ m~^], in a time unit we get an optical power: 

P = 2AF[l + coskD] (1.14) 

A representation of this ideal function is shown in figure [T31 The optical power 
obscillates infinitely over the OPD variable. Each period is called interfe- 
rometric fringe. Two consequent fringes are separated by ^. This has a 

^The energy of a beam crossing a unitary area perpendicular to the propagation direction 
is proportional to the temporal average of the square of the electric field: 

1 — >+oo 11 J _J' 
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physical explanation, too. A difference As of the observation direction can be 
interpreted as an angle in the sky (in radians), so two different fringe peaks 
projected in the sky are separated by an angle given by [U, page 12]: 




(1.15) 



OPD [microns) 

Figure 1.3: Idealized interferometer scheme. 



1.3.2 Polychromatic source 

Let now allow the source to be polychromatic. We report the detailed compu- 
tation of the interferogram pattern in this ideal case because we will use it in 
the future work. 

We can consider the source flux as the harmonic composition of components 
at different frequencies, as photons do not interfere with each other. 
Let us suppose that the source flux is constant over a range of frequencies 
V G [z^i,t'2]: Fy = Fq. Also the interferometer will have a finite spectral 
response rji^u). The total optical power will be, modifying eq. 11.141 

P = 2 j AFyr]{u)[l + coskD]du (1.16) 

In the ideal case, the interferometer response is a perfect bandpass filter, i.e. 
a rectangle over the band [i/i,z/2], centered in z/q = ^^ii±2 g^j^^ with length 
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Ai/ = 1^2 — 1^1, with constant value rjo: 

j'Uo+Au/2 

P = 2A F^r]o[l + coskD]diy (1.17) 

Juo-Au/2 

Remembering that k = 27ri//c, the argument of the cosine becomes: 

27ru D , , 

kD = D = 2'KV— = 27VUT 1.18 

c c 

where r has the dimension of time. Substituting in eq. 11.171 we obtain: 

l'Uo+Au/2 

P = 2A F^rio[l + cos{27TUT)]du = 

Juo-A.u/2 

= 2AFQ7]o[Aiy + 2 )- ^- co s {27r t uq)] = 

2txt 

= 2AFor]oAu[l + ^ '-cos(27ituo)] 1.19 

We have now to consider again the relation between r and the OPD D of eq. 
[TTSl We obtain: 

TTTAu = IT — Az/ = TT — (z/2 — Ui) = TT — (- — ) = TT — C — - — ~ 

c c c A2 Ai c A1A2 

^ 7rD^ = 7r^ (1.20) 

Aq 

where we have used the relation z/A = c, where Aq = c/ i^o is the central wave- 
length of the wavelength range, and where Lc = 4^ is the coherence length, 
where the fringes are formed. In a similar manner, for the cosine argument we 
find: 

D c 2TrD , 
2nTUo = 2tt—— = — — = koD. (1.21) 
c Ao Ao 

Substituting into eq. 11.191 we finally get: 

2AFo7]oAi^[l + \ ^cos 27rrz/o = 

TlAuT 

. r 71" -D ,2ttD^, 

= 2AFor]oAiy[l + sinc—cos{^—)] (1.22) 

J^c Ao 

where the function sinc{x) = jg |;]2e 'sinus cardinalis', or cardinal sine. 

Looking at the last equation, we can recognize two different modulation pat- 
terns. One is the cardinal sine: it has a modulation frequency that depends 
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on the fiher range Ai/, here hidden in the coherence length. We remark that 
it comes from the Fourier transform of the rectangular bandpass; the variables 
V and A form a Fourier pair, with the normalization factor c. The other pat- 
tern is the cosine, modulated at the frequency fco? correspondent to the central 
wavelength Aq. Figure 11.41 shows a typical interferogram pattern: fringes are 
modulated over the optical path difference D, and the sine creates an enve- 
lope that smooths the fringes amplitude as far as D is far from the zero OPD 
(ZOPD), where we have the maximum of the envelope. 



polychramatic interferogram 




palychromatic interferegram 




DPD (microns) 



Figure 1.4: Polychromatic interferogram in H band. The dotted hne is the 
envelope, i.e. the sine function. The central wavelength is A = 1.65/im for 
both graphics, while the waveband changes from 0.16/im (left) to 0.30/ito 
(right), giving a different coherence length: 17.016/im and 9.075/iTO, respec- 
tively. 



Geometrically, if the OPD is zero the two beams overlap perfectly even at dif- 
ferent wavelengths. If the OPD is not adjusted to be zero, beams at different 
wavelengths will have their maximum coherence at different position, causing 
a decrease in the interference amplitude. Figure 11.41 can help visualizing this 
concept. 

If the interferometer bandpass filter was described by a different function, the 
Fourier Transform of this new filter would modulate the envelope pattern. 

1.3.3 The complex visibility 

Information on the observed source have to be extracted from the interferom- 
eter output. A fundamental Fourier transform relationship holds between the 
optical power measured by the interferometer and the source brightness func- 
tion. This relation is known as the van Cittert-Zernike theorem. To describe 
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intuitively the nature of this relation, let us come back to the monochromatic 
description, and let us introduce a spatial variable on the sky, say the solid 
angle fl, with reference to the vector of observing direction sq. The angle Q is 
moving on the star surface S. 



Let us assume that the instrument response is described by a function ri{so, a) 
and the flux intensity by F{so, a). As a moves on the star surface, it identifles 
a different region on the star. A portion of source surface with dimension 
dQ (see flg. II. 5p . related to the observing direction sq + a, sufliciently small 
to ensure that //(sq, a) and F{so, a) can be considered constant over dfl, will 
generate an optical power Pdn- 



If we further assume that all surface portions on the source add up incoherently, 
we can write the interferometer output as an integration, putting together eq. 




Figure 1.5: Area dfl on the source S. 



Pdn = r]{so,a)F{so,a)[l + cosk{B ■ (sq + a))]dQ 



(1.23) 



20 



Interferometry: from theory to fringe tracking 



[nS] with 03 as: 

P = 2 I r]{so,a)F{so,a)[l + cosk{B ■ {so + a))]dn = 
Js 

= 2 ■r]{so, a)F{so, d-)dil + 2 / ■r]{so,a)F{so,a)cosk{B ■ {so + a))dil 
Js Js 



= Pq + 2cos{kB ■ Sq) / ri{sQ, a)F{so, cr)cos{kB ■ a)d^l 

Js 

- 2sin{kB-so) / r]{so, a)F{so, a)sin{kB ■ a)dn = (1.24) 
Js 

= Pq + 2cos{kB ■ So)Re{V} + sin{kB ■ so)/m{1/} = Pq + Re{Ve'^^-'^} 



where the function V is the complex visibility of the brightness distribution F, 
referred to the phase reference Sq, and is defined by: 

V = V{k,B)= [ ri{so,a)F{so,a)e'''^-''dn (1.25) 
Js 

This relation is known as the van Cittert-Zernike theorem. The complex vis- 
ibility is function of the observing wavelength, through k, and of the baseline 
B. Remembering that ~ can be considered angles in the sky, noting as Bx 
and By the projection of the baseline over the ground coordinate system, we 
define spatial frequencies the coordinates {u, v) defined as: 

B^ B,,, 
«=^, v = ^ (1.26) 

With a good coverage of the sky in the {u, v) coordinates it is possible to 
invert the complex visibility to obtain a dirty brightness distribution, so called 
because it is biased by the sampling function of the sky, i.e. the (m, v) coverage. 
There is a huge effort in the domain of cleaning the brightness distribution and 
of reconstructing images from it. 

Visibility properties 

The complex visibility has several properties, directly derived from its defini- 
tion. Being A and F real, for the visibility holds the following relation: 

V{-u,-v) = V*{u,v) (1.27) 

If we add a delay a to the reference phase sq, we have seen in eq. 11.241 that 
this is equivalent to adding a delay in the phase domain: 

Po + Re{Ve'''^-''>} (1.28) 
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so the fringes are translated in the optical path difference space. 

It is a dimensional quantity. It is common to use a normalized visibility, 
first introduced by Michelson, that compares the intensities of the dark and 
the bright areas of the interferogram: 

P — P 

■rr ^ max ^ mm /, nr\\ 

= ^ — — u-^yj 

^max ~r J^min 

This is a non dimensional quantity, varying in the interval [0, 1]. It goes to 
zero when the intensity does not vary along the OPD {Pmax = Pmin), and is 1 
when Pmin = 0. It can be shown that IH page 20] 

Vm=\V\. (1.30) 

We can resume saying that the visibility is a complex quantity; its modulus 
is the fringe contrast, while the phase contains information about the shift of 
the fringes from the zero OPD, i.e. from the reference position. 



1.4 Fringe Tracking 

We have seen in the previous paragraph how it is possible to find a relation- 
ship between the brightness distribution of the observed source and the optical 
power recorded by an interferometer. Moreover, we have mentioned the po- 
tentiality of interferometry with respect to traditional monolithic telescopes. 
We have pointed out, however, that a delay in the observational direction 
introduces a delay in the phase domain. The same applies when a delay is 
introduced in one arm of the interferometer. In both cases, the result is an 
unbalance of the optical path of the two beams before the combination. 

We have mentioned in section 11.21 that a differential optical path between the 
beams impedes the fringes at different wavelengths to overlap perfectly: the 
interference is not maximal. The principal responsible of this phenomenon 
is the atmospheric turbulence, that forces the beams to do additional optical 
path before the collection at the telescopes, in a random way, different at all 
wavelengths. Also instruments can add an extra path, usually static or slowly 
varying. The consequences on the visibility depends from several factors: the 
estimator used for the measurements of the visibility itself, the kind of OPD (a 
simple shift, or a OPD with a velocity and an acceleration). Detailed studies 
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on the subject can be found in hterature (see [12] and references therein). 

The visibility degradation, added to the possibility of increasing the integra- 
tion time well beyond the coherence tim^ if the beams are maintained aligned, 
explains why the astronomic community judged necessary the introduction of 
dedicated instruments able to measure and correct the OPD in real time, equa- 
lizing the paths of the beams. These instruments are called the fringe sensors, 
used in the fringe tracking closed control loop. Their role is to follow the OPD 
very quickly, well beyond the atmospheric change rate, and to correct it. The 
challenge is great: the operations must be very agile and accurate. 

All big interferometers have been equipped with a fringe tracking system. 
The Keck Interferometer was equipped with FATCAT (see, e.g., Vasisht [T3] 
and references therein). The VLTI was first equipped with FINITO; the 
VLTI PRIMA instrument, dedicated to astrometry, has its own fringe sen- 
sor, PRIMA FSU. 

For the subject of this thesis, we will focus onto VLTI instrumentation and 
data reduction. 

1.4.1 Fringe Tracking at the VLTI 

The ESO VLTI (European Organisation for Astronomical Research in the 
Southern Hemisphere Very Large Telescope Interferometer, www.eso.org) has 
been designed to combine up to four Unit Telescope (UT), with apertures of 
8 meters, and several Auxiliary Telescopes (AT) of 1.8 meters diameter. Each 
UT can work as a stand-alone conventional telescope, but the array of tele- 
scopes can guarantee a maximum baseline of ~ 200 m for ATs and ~ 100 m for 
UTs. There is a first generation of instruments working in the near infrared, 
and dedicated mainly to the visibility measurement: VINCI was the commis- 
sioning instrument [H] [in] , responsible for testing the working performance of 
the VLTI, working in K band ([2.0 — 2.5]/im) from the beginning of this cen- 
tury; AMBERpLSj, the first attempt to combine up to three beams, working 
in J ([1.1 — 1.3] fim), H ([1.4 — 1.9] fim) or K band, whose fringes were first 
recorded in 2004, MIDI[T7], dedicated to thermal infrared (N band, lO/xm). 



•^The wavefront can be considered flat over a time interval, called coherence time, which 
depends on the observing wavelength, and it is characteristic of the site 
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All these instruments have a temporal resolution worse than the average co- 
herence time, so they can not do fringe tracking alone. 

The first prototype for a dedicated fringe tracker was developed by the Ob- 
servatoire de la Cote d'Azur (OCA, Nice). A description can be found in flB\, 
and the working principle will be resumed in chapter O From this prototype, 
the Astronomical Observatory of Torino, in collaboration with ESO, developed 
the first fringe sensor for the VLTI: FINITO (Fringe tracking Instrument of 
Nice and TOrino). It was constructed in Torino, and delivered to ESO in 
2004. During the commissioning the fringe tracking loop could not be closed, 
meaning that the instrument wasn't able to check the actual OPD value and 
to correct it. It was not an instrument fault, at the contrary it was useful to 
identify several problems of the overall VLTI system the delay lines had a 
residual alignment error, the Adaptive Optics (AO) left an internal turbulence, 
and the source fiux was subject to intensity fiuctuations due to saturations of 
the mirrors of AO, there were vibrations that induced a distortion of the mo- 
dulated fringe pattern. The fringe-tracking loop was finally closed in 2006, and 
now FINITO is routinely used in association with other scientific instruments, 
such as AMBER. 



In the same time frame, the OATo was involved in the implementation of a 
fringe sensor (FSU) facility for a second generation instrument, PRIMA pU] 
(Phase Referenced Imaging and Microarcsecond Astrometry). PRIMA aims 
not only at visibility estimation, i.e. the modulus of the complex visibility as 
we have seen before, but also to the phase of the complex visibility, in order 
to be able to reconstruct images from interferometric data. Moreover, it has 
two separate FSUs, to simultaneously track the scientific object, potentially a 
faint source, and the reference bright star. 



The difficulties encountered by FINITO, which suffered the lack of information 
on the environmental condition and on the received fiux features, forced to 
elaborate a more sophisticated interferometric model and to add a number of 
instrumental degrees of freedom in order to be able to properly calibrate each 
FSU. 



24 



Interferometry: from theory to fringe tracking 



1.4.2 Interferometric working condition at VLTI 

The atmospheric turbulence induces disturbance on OPD and intensity fluc- 
tuations. Less important, but not negligible are the perturbations caused by 
instruments and by the environmental conditions. These effects must be re- 
jected by the interferometric system in order to work properly. 
A description of the influence of the main subsystem of an interferometer and 
of the turbulence model can be found in [21]. Here we recall some of the 
principal aspects. 

Atmospheric turbulence 

Lot of researches have been carried on for the description of the atmospheric 
turbulence, and it still is an open fleld. For our purposes, we use the model 
developed for the ESO VLTI, described in p2] and based on the Kolmogorov 
model. 

Above a cut-off frequency z/^ = 0.22v/B, where v is the wind speed (meteoro- 
logical conditions) and B is the baseline (observing conflguration) , the power 
spectral density of the disturbance on the OPD can be approximated by a 
power-law formula: 

PSDoPoiu) = So-Xl- r-'" ■ v"' ■ z/-«/^ (1.31) 

where 5*0 can be measured and is equal to 0.0039 in standard VLT condition, 
Ao is the central wavelength and tq is the Fried parameter. Below the cut-off 
frequency, the spectrum can be simply approximated by so it is inde- 

pendent on the baseline and on the wavelength, and for very low frequencies 
the slope become positive. 

Instrumental issues 

The atmospheric coherence length depends on the wavelength, it ranges from 
few milliseconds in the visible range to few ten ms in the near infrared and few 
hundred ms in the thermal (medium) infrared. 

Since the collecting area of each single telescope is larger than the atmospheric 
coherence length, an adaptive optic system is required. Its role is to flatten 
the incoming wavefront, correcting low frequencies turbulence. It is essential, 
especially for large apertures. We can say that interferometric performances 
depend on those of AO. 

However, the AO correction is done on each single telescope, which is affected 
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by a different wavefront corrugation with respect to the other apertures, cau- 
sing a residual OPD to be introduced in the system. For basehnes longer than 
few meters it is necessary to track this OPD disturbance and to correct it. 

After the collection, the beams are sent into the delay lines (DL), that have 
a double role: first, they carry the beams from individual telescopes to the 
combination laboratory. Second, they correct the OPD sensed by the fringe 
sensor, thanks to the OPD controller, that receives information from the sensor 
and sends them to the delay lines, according to a specified control algorithm. 
In the VLTl, the DL are not evacuated, and this fact leaves a mismatch between 
the zero OPD, where interference at each wavelength reaches its maximum, 
and the overall group delaj0 caused by the wavelength range, (longitudinal 
dispersion) . 

The fluctuations caused by instruments (delay hues, residuals from AO, from 
the electronic boxes of other instruments before the fringe tracker and of the 
fringe tracker itself) add up to the atmospheric turbulence. A metrology sy- 
stem can be foreseen, at the FT level, to check the internal optical path, and 
its information can be used locally, to stabilize the internal path, or sent back 
to the OPD controller. 

At the end of the combination chain, a detector records the interferometric and, 
eventually, photometric intensities. Modern detectors can reach high recording 
rates with an acceptable noise level. Both FINITO and PRIMA FSU adopted 
integrating detectors: the integration time can be set by the user, together 
with the read-out mode. Thanks to the interposition of a dispersing prism, 
the PRIMA FSU detectors record also different spectral bands in contiguous 
pixels. After the data saving, the optical path difference can be evaluated, 
using a proper combination of the interferometer output. 

The role of the algorithms responsible of this evaluation is of course very im- 
portant. They must have good performances both in accuracy and in velocity. 
In the ideal working condition, the limiting sampling rate should be due to the 

^The group delay (GD) is a measure of the optical path difference that takes into account 
the dependence of the refractive index from the wavelength. Beams can travel through 
different paths in air, with different refraction indexes. See, for example, P. Lawson in [U 
p. 115] 
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finite photon fiux from sources rather than to instrumental performances. 



Chapter 2 



Location algorithms for the 
VLTI FINITO fringe tracker 

In this chapter, we will focus on real-time fringe tracking algorithms that are 
suitable in the framework of temporal modulation. Starting from a general 
approach due to J. W. Goodman ([23]), based on the Fourier Transform of 
the interferometric data, we will introduce a temporal version of it, used for 
a laboratory prototype of fringe tracker, the PFSU ([21]), and finally we will 
describe the location algorithms proposed by the Observatory of Torino for 
the VLTI ESO FINITO. In this last framework, I worked on the simulation for 
the adapted demodulation algorithm (par. 12.2.21) and on iterative techniques 
for the flux intensity monitoring task (par. 12. 3p . 

2.1 Classical algorithms for fringes location 

We have explained in chapter [T] that it is important to know as precisely as 
possible the position along the OPD scan of the fringe packet, in order to reach 
high sensitivity with longer observational intervals. This is possible thanks to 
fringe sensors, that evaluate the current OPD using dedicated algorithms, and 
send the information back to OPD correctors. 

The fringe sensor provides regular measurements of the fringes intensity. Al- 
gorithms require the knowledge of the essential parameters of the fringes, such 
as intensity, visibility, working wavelength and spectral range. These parame- 
ters are matched with those of an interferometric model, to finally obtain the 
desired differential phase between interfering beams. 
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There are at least two ways to physically record the fringe packet pattern, and 
they are known as 'spatial' and 'temporal' modulation. The former consists 
in the simultaneous recording on the detector of selected points on the fringe, 
separated by a known phase. The latter is implemented recording the inten- 
sity values throughout a controlled modulation, with known applied phase, of 
internal OPD. 

In both situation, the basic measurement scheme is the AC (two points for 
fringe, separated by a phase of vr rad, or equivalently by an OPD of A/2). 
However, the phase can be evaluated modulo vr, leaving a position uncertainty 
inside the fringe. The ABCD scheme (four points, separated each by 7r/2 rad) 
avoids this problem. 

In the ideal noiseless model, the performances of ABCD algorithm can be an- 
alytically estimated, and we report them in par. 12.1.11 These values are taken 
as reference even with other algorithms, for which the analytical evaluation is 
less straightforward. 

For the FINITO fringe tracking instrument, described in par. 12.21 and based 
on temporal modulation of the internal OPD, two algorithms have been pro- 
posed, apart a modified ABCD (par. I2.2.4p . The first is a classical demod- 
ulation scheme, adapted from the PFSU prototype algorithm, developed by 
the Observatoire de la Cote d'Azur and described in par. I2.1.3[ The latter is 
based on correlation with a template, and is resumed in par. 12.2.31 For both, 
we present the interferometric model and the assumptions on which they rely. 
The simulations are done with the IDL programming environment. Additional 
algorithms are required to solve the fringe uncertainty, i.e. to remove the pe- 
riodic degeneration within the modulated envelope. 



2.1.1 Ideal ABCD 

We resume here the classical fringe-tracking ABCD scheme, which is able, in 
ideal condition, to estimate the essential fringe parameters modulo 2tt. The 
ABCD sampling of the fringe is represented in figure 12.11 it consists of four 
points in quadrature over a single fringe of constant flux intensity F and visi- 
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bility V: 



A = F-{l + V-sm4. 

B = F-{l + V ■ sm(^ 

C = F-(l + y-sm(^ 

D = F- (1 + 1/ -8111(0 + 3^2)) 



7r/2)) = F-(l + l^-cos. 
n)) = F ■ {1-V -siiKj)) 



F- (1-1/ -cos, 



(2.1) 




Figure 2.1: ABCD scheme 



from which we can easily obtain, through trigonometric relations: 



A 



arctan ■ 



A-C 



B-D 



F 



-{A + B + C + D) 

{A-Cf + {B- Df 
F2 



(2.2) 



where A is the working wavelength. 



If we suppose that the only uncertainty on the A, B, C and D estimates is given 
by the photonic noise and by readout noise (see par. 12.2.21 for a description), 
whose variances can be approximated with the mean flux and a constant R, 
respectively, we obtain the following equations for the residual noise on the 
flux F, the visibility V and the OPD estimates: 



a^{A) = A + R] a\B) = B + R; (r\C) = C + R; a\D) = D + R 
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a{^) = l^^^ ^ a{OPD) ^ 



V-SNR ' ' 2 tiV ■ SN R 

a{F) = VF + R 

where the Signal to Noise Ratio (SNR) is, in this case, given simply by: 

2F 

SNR = , (2.4) 

2.1.2 Demodulation algorithm at low light level 

In 1973 Walkup and Goodman[2S] described the limitation of the fringe pa- 
rameters estimation at low light levels, both for the spatial and the temporal 
modulation. In this approach, with a good dispersion of the fringes over a 
sufficient number of pixels of a detecting system, it is possible to extrapolate 
information about phase and amplitude of the interferogram from the zero 
component of the Discrete Fourier Transform (DFT) of recorded data: 

^(^) = iv 5^ n(j)e-2-^^™/^, m = 0, . . . iV - 1 (2.5) 

i=o 

where is the number of pixels, n(j), j = . . . — 1 are the counts, X(m), 
m = 0, . . . iV — 1 is the DFT of the pixels counts. 

We recall the definition of the coherence length for a polychromatic interfero- 
gram over the wavelengths range [Ai, A2], given in chapter [T], as a function of 
the range width AA = A2 — Ai and of its central wavelength Aq: 

CL = ^ (2.6) 

Let /o be the spatial frequency corresponding to Aq, that will be the working 
wavelength of the modulation. If L ~ 2 ■ CL is the modulated path correspond- 
ing to the central lobe of the interferogram, rounded to an integer number 
of fringes, then the following statements hold: 

L 

L = uq ■ Xq — >■ Ao = — 

no 

/„ = ^ ^ = (2.7) 
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The number of fringes hq will be ~ 2-CL/Ao. Then the mean number of counts 
n{j) can be written, according to the simple interferometric model introduced 
in chap. [U 



with Xt the mean number of the pixel values over a number of recording, 
comprising both signal and background, and V the fringe visibility. 
The mean values of -R(no) and /(no), i.e. the real and the imaginary parts of 
X(m) when m = Uq, are given by: 



thanks to summation over complete periods. From this equation the phase 0, 
the visibility V and the mean flux xt information can be retrieved, in a similar 
procedure as ABCD scheme. 

In low flux regime, the counts register can be modeled as Poisson variables, 
which mean can be approximated with an average of pixel values, recorded 
subsequently. Assuming fluctuation noise negligible and background noise as 
independent upon the signal counts, it can be shown that the mean of the 
real and the imaginary part of the FT follows a circular gaussian distribution, 
and it is possible to give a measure of the error in the estimation of the fringe 
parameters. The analytical derivation can be found in pS] . 
These estimation, however, are valid for low flux level, and to reach them it 
is necessary to average over a number of measurements, and so it is not so 
useful when phase information are needed at high frequency rate, and there is 
no time to perform a good average. 

The basic idea to retain is that the Fourier transform of data contains infor- 
mation on fringe parameters. 




(2.8) 





(2.9) 
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2.1.3 PFSU and LAMP 

The first fringe sensor concept proposed for the VLTI has been LAMP ([21]), 
acronym for Large Amplitude Modulated Path, developed by the Observatoire 
de la Cote d'Azur (Nice, France). The fringe sensor is devoted to detection of 
the error on the optical path and to communicate its value to the dedicated 
OPD corrector. 

The basic idea of the LAMP algorithm(|18j) is that it is possible to retrieve 
information on the fringes parameters through the spectral analysis of the flux 
intensity data. To use this technique, it is essential to modulate the optical 
path over several wavelengths. In this way, the resulting signal contains both 
information about the phase of the white fringe (cophasing) and about the 
absolute position in the envelope of the polychromatic fringes (coherencing). 
The control range plays an essential role. If the modulation path is larger 
than the coherence length of the fringes, the interferogram shape does not 
show truncation effects. In fig. 12.21 a truncated interferogram is shown: the 
OPD scan is smaller than the coherence length, so the number of fringes is not 
integer, the total energy of the recorded modulation path is not maximum. 




OPD [micrDns] 

Figure 2.2: Interferogram over an OPD scan smaller than the coherence 
length. 



The monochromatic interferogram produced by the modulation of the optical 
path will be expressed as: 



27r£ 

1 + Kcos ( — ^ + 

-^0 



0<^<L 



(2.10) 



where ^ varies in the modulation range, Ig is the signal intensity (source and 
background), Vg is the overall visibility (different from the effective visibility 
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of the source), and is the unknown phase of the fringe. 

We assume that the phase varies slowly with respect to the modulation period, 
otherwise the phase values given by the algorithm would be a sort of mean of 
the varying phase, and the correction would be useless. 

The procedure to find the phase modulo 27i is analogous to that of Goodman, 
with the obvious difference that we are working with a temporal modulation, 
and not with a spectral dispersion, which allows to have a spatial distribution 
at each time instant. 

The modulated path is a symmetric triangular periodical function of time 
(sawtooth) with frequency /o and amplitude uoXq, chosen approximately equal 
to twice the coherence length (L ~ 2 ■ CL). Note that uq is an integer, so the 
modulation is done over an integer number of fringes. Let N be the total 
number of samples covering L. 

The signal component at frequency /o carries the information about the phase 
0, so it is detectable with the analysis of the corresponding component of the 
Fourier transform ([26]): 

N 

r^= cos(27r^/ Aq + 0) • sin(27r^/ Aq) = ^ cos(27r^j/ Aq + 0) ■ sin(27r^i/ Aq) = 
^ i=i 

1 ^ 1 

= -J2 sin(27re./Ao) sin(0) = -- sin(0) (2.11) 

1=1 

thanks to the summation over an integer number of fringes, i.e. over the 
complete range [0,27r]. In the same way: 

V = y cos(27r^/Ao + 0) ■ cos(27r^/Ao) = ^ cos(0) (2.12) 

Apart the factor 1/2, these are the real and imaginary part of the complex 
number e"* "'^. We find the phase and the corresponding OPD simply as: 

= arctan(-^) ^ OPD = —0 (2.13) 

20 2-K 

Note that this is a measure modulo 2tt (or A), for the properties of the 
arctangent function. To do coherencing, i.e. to find the absolute position 
within the coherence length, a proper combination of the lateral frequencies 
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f±i = 27r(no ± 1)/L = /o ± 27i/L is needed. In an analogous way as before, 
the modulated interferogram is filtered to find the phases ± ^ modulo 27r. 
The differential value is ^ modulo 27r, from which we finally obtain modulo 
2no7r, i.e. modulo the coherence length. 

This last measure is complementary to the phase modulo 27r because it gives 
the position of the intensity maximum over all the coherence length, while the 
phase modulo 27i does not guarantee that the evaluated phase is a secondary 
maximum in a lateral fringe, instead of the central peak. 

Two aspects have to be stressed. First of all, the presence of an integer number 
of fringes is important to filter out the modulation path. Then, these formula 
are based on the modulated part only, and do not include the offset of the 
interferogram Is or the amplitude of the modulation [Ig ■ Vs). This means that 
the original signals have to be normalized before using them. 



2.2 Algorithms for FINITO 

After an introduction on the VLTI FINITO fringe tracker, we describe the 
location algorithms that we proposed for the VLTI FINITO fringe tracker. 
The first is based on the demodulation algorithm, the second on a correlation 
with a template. 

2.2.1 Instrument description 

FINITO (Fringe tracking Instrument of Nice and TOrino) is a fringe sensor 
unit developed by the Osservatorio Astronomico di Torino, in collaboration 
with ESO. It is based on the PFSU laboratory prototype. 
It is an interferometric instrument based on amplitude combination of either 
two or three telescopes beams. It operates in H band (A e [1.48 — 1.78] fim). 
Figure [213] shows the overall scheme of the instrument. The astronomical beams 
are injected into monomode and polarisation maintaining optical fibres. The 
fibres act as spatial filters, because the injection system retains just the cen- 
tral lobe of the incoming wavefront, and reject the most aberrated lateral one. 
The optical paths of the two beams are modulated by piezoelectric devices, 
that stretch the fibres in order to modify the paths. The differential path 
is monitored by a metrology system, based on the superposition of a laser 
source at A = 1.31/im. This laser beam shares the same optical path of the 
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astronomical beams during the modulation, then it is removed and separately 
combined, and the current differential path is achieved and used for correction. 




Figure 2.3: FINITO layout 



Therefore, the modulation applied to the astronomical beams through this 
closed control loop can be considered as ideal. In turn, the phase detected on 
astronomical beam combination is a measurement of the external disturbances 
which are to be compensated for stable integration on science combiners. 

After the separation of metrology from astronomical beams, the two polariza- 
tion components of each beam are separated; one is retained for photometry 
purpose, while the other is sent to a beam combiner. This approach avoids to 
deal with phase differences between polarization components. 
In the beam combiner, one of the three input beam is split and superposed 
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separately with the others. The former acts as a reference beam, and the in- 
terferometric signals allow detection of the relative phase of each beam pair. 
Four interferometric outputs are produced, two destructive and two construc- 
tive; they are finally focused on a detector, together with the three photometric 
outputs. 

The goal of FINITO is the measurement in real time, allowing correction by 
the OPD controller, of the perturbation to the optical path, caused in primis 
by atmospheric turbulence that affects photons by the stars. The instrument 
adds some disturbance, such as modulation and readout noise (RON); optical 
elements can produce fluctuations of the phase and the amphtude. Moreover, 
all these factors limit the performance of the instrument. 

Its set-up is optimized taking into account the operational conditions: the scan 
amplitude should be comparable with the coherence length (for the H band, 
i.e. [1.5 — 1.8] /JUi, with AA = 0.3 fim and central wavelength Aq = 1.65 fim, 
the number of fringes is about 2Ao/AA ~ 10), the fringe scanning rate must be 
faster than the typical atmospheric turbulence, even if higher rates correspond 
to shorter integration times, and so to lower sensitivity. 

Different algorithms have been proposed for the evaluation of the current opti- 
cal path difference to be compared with the modulation one. The mandatory 
request for all the algorithms is to execute in real time, so they can't make 
use of too many computations. In the following sections, we review them with 
their performances. 

2.2.2 Demodulation algorithm 

The demodulation algorithm proposed for FINITO is an implementation of 
the LAMP concept. 

Let us consider the following simplified description of the two complementary 
outputs of each metrology beam combination: 

F{P) = l{Ii + l2 + 2-V 
G{p) = ^{/i + /2 + 2-y 

where F{p) and G{p) are the signals from the constructive and the destructive 
outputs, /i,/2 are the incident beams intensities, V is the overall visibility 



• ^Jh ■ hmip) cos(27r— — -)} 

M 

■ y/h ■ hmip) cos(27r^-^ + tt)} (2.14) 

Ao 
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(composition of the source and of instrumental visibilities), p is the optical 
path difference, pi is an additional OPD that comprises the contributes of 
different kind of noises, Aq is the central wavelength of the working range and 
m{p) is the envelope, i.e. the contribution to the fringe pattern due to the 
polychromaticity of the beams. The two signals have a phase difference of vr. 
We can rewrite them as: 

F{py, G{p) = ^ |/i + /2 ± 2 ■ y • ^/h^2m{p) cos (27r^^^ | (2.15) 

that explains the "constructive" and "destructive" definition of F{p) and G{p). 
If we subtract F{p) and G{p), we eliminate the common offset: 



In order to apply the LAMP concept, we have to make some adjustments. 
First of all, the identification of the amplitude factors is needed, in order to 
define a proper filter for the Fourier analysis. If we use a simple sinusoidal 
wave, as explained in section 12.1.31 the results are worsened by the lack of 
matching between the sinusoidal wave and the interferogram. Figure [23] shows 
the results of a simulation of the OPD evaluation using a demodulation over 
10 fringes, with a step of a twentieth of fringe, when the modulation law is 
a ramp and the introduced phase error is a constant atmospheric piston of 2 
nm. 

The bad quality of the phase error evaluation is due to the shape of the inter- 
ferometric beam, which presents the envelope pattern. In the area near to the 
zero OPD, this effect is weakened because the interferometric intensity has a 
maximum, together with the envelope. 

We then define a shape for m{p). The polychromatic spectrum is a superpo- 
sition of monochromatic components uj{\,p): 



In the nominal case, wavelengths in the selected range give the same contribute 
to the polychromatic beam, while all other wavelengths outside the band give 
no contribution: 




(2.16) 





1 if Ai < A < As 
otherwise 
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evaiuated phase 
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Figure 2.4: Phase error evaluated with the demodulation algorithm, using 
as templates simple sinusoidal functions (sin and cos) at same frequency 
than the central fringe. The introduced phase error is constant (2 nm). The 
abscissa axis reports the number of points over the OPD scan, while on the 
ordinates the evaluated phase amplitude is shown. 



The function uj{X,p) acts as a perfect rectangular filter: 

fTT-ip) = / uj{X,p)dX = / uj{X,p)dX. 



We are in the situation described in par. 11.3.21 of chapter [H The resulting 
model for m{p) is a sine function: 

, , sirnix 
mip) = 2.17 

Tlx 

where x is the ratio OPD to coherence length: x = OPD/CL. 



We define two functions for the demodulation, tailored on the modulating 
envelope: 

ti{x) = sin(27rx/Ao) ■ sinc(7rx) 

hi^) = cos(27rx/Ao) ■ sinc(7ra;) (2-18) 

in order to better suit the shape of our interferogram, and we apply the LAMP 
concept, filtering the signal S{p) with the modified templates ti{x) and t2{x). 
Note that the subtraction of the destructive and constructive waves cancel out 
the common offset; if this offset is not equal, it must be evaluated (for example, 
by photometric measurements) and eliminated. 
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Expected performances 

As for the ABCD case, also for this algorithm the performance is influenced 
by the SNR. An estimation for the minimum detectable phase can be found 
in [IH] and is given by: 

Piston variation below this limit cannot be recognized by the algorithm, so 
this is a limiting performance. 



Performance 

The performance of this adapted algorithm depends on the type of noise that 
affects the optical path difference or the interferogram intensity, and so is 
reflected in the SNR. 
Different kind of noises are considered. 

• The photonic noise is linked to the particle nature of light and to the fact 
that the arrival time of the photons is random. Under some conditions 
(semi-classic theory of detection, see [23] for references), it follows a 
Poisson distribution with rate proportional to the square root of the 
intensity of the incident light. 

• The scintillation is defined as a variation in the intensity of the flux 
collected by a telescope. It can be caused by refraction effects in atmo- 
spheric layers, especially in small structures caused by turbulent phe- 
nomena. It depends from the position of the source in the sky, and from 
observational conditions. It can be modeled as a time sequence having 
a slowly decreasing spectrum s{6), depending also from the wavelength: 
s{9) oc Aq ■ 9~^^^ in order to simulate also high-frequency turbulence 
typical of the atmosphere. 

• The RON (Read Out Noise) is caused by the detector, and can consist 
in the uncertainty of the digitalization as well as small charges induced 
by electronic components. The easiest way to model it is with a con- 
stant value representing a statistic of the error on the detector read out, 
directly proportional to the integration time and the flux intensity 

• The shotnoise is linked to the quantization of the receiving matter. An 
approximation is with a random variable drawn from a normal distribu- 
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tion of mean zero and standard deviation a = This formula takes 

into account the SNR: the bigger it is, the smaller is the shot noise. 

• The atmospheric disturbance spectrum has been studied for long (for 
VLTI environment, see, e.g., It is function of the geographical 

location and of weather conditions. An approximation of the low fre- 
quencies (/ < 100 Hz) of such noise is given in figure 12.51 The power 
spectrum density of the atmospheric variation is proportional to f~s 
and to atmospheric parameters, such as the wind speed and the Fried 
parameter (a measure of the coherence length of the atmosphere). The 
phase is randomly distributed following a uniform distribution in the 
range [— vr, n] . 

introduced phase 

1.5 

1.0 

S 0.5 
E 

0.0 

-0.5 

50 100 150 200 

number o1 cases 

Figure 2.5: Stochastic sequence of atmospheric induced phase. 




In the following simulations, an interferometric signal is generated using the 
description of equation 12.141 and the following parameters: 

• working wavelength: A = 1.65 fim in H band 

• source magnitude = 13 

• estimated visibility: 0.72 

• flux at zero magnitude: 4.8el0 

With the listed parameters, the reference performance for the OPD is given 
by: 

1 65 

aoPD = ■ = 0.036/im (2.20) 

^ 27r-0.72-10 ^ ^ ' 
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We first do not add any noise on the amplitude, to assess the nominal perfor- 
mances of the algorithm. We can see (figure 12.61) that when the introduced 
phase is a constant (0.3 yum), the error between the evaluated phase and the 
introduced one is of order of 10~^ fim. 

We now add an atmospheric disturbance on the OPD, with the same features 
of the one shown of figure 12. 5[ Of course, the estimation error increases, and 
also shows a pattern inherited from the introduced OPD (figure l?!T|) . However, 
it is well beyond the reference limit set by eq. 12.201 since its standard deviation 
is of order of 10~* fim. 

Therefore, the noise induced by model/algorithm error is quite negligible with 
respect to that associated to physical noise. 




Figure 2.6: Left, the phase evaluated by the algorithm, right, the error 
between the evaluated and the introduced one. The mean value of this last 
discrepancy is: 9.6 • 10^^ /im and the standard deviation is: 2.8 • 10~^ ^m. 
The two graphics have different ranges of values for comprehension reasons. 




Figure 2.7: Left, the phase evaluated by the algorithm, right, the error 
between the evaluated and the introduced one, when the introduced noise is 
of the kind described in figure 12.51 The mean value of this last discrepancy 
is: 9.6 • 10"^ and the standard deviation is: 2.8 • 10~^ jim. 



We finally add also noises on fringes amplitude, both observational (photonic 
noise and scintillation) and instrumental (RON and shot noise): 
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• RON = 10 photons/sec 

• photonic noise with rate equal to the square root of the flux of each 
incoming beam 

• shot noise, modelled as a normal variable with a = J]^, with SNR = 10 

• scintillation, following the previous description. 

Even if the interferogram shows a noisy pattern, due to shot noise and pho- 
tonic noise, the performances of the algorithm are still quite good (figure . 
Again, its standard deviation (9.2 nm) is in accordance with the reference one. 




-10 -5 D 5 10 50 1D0 150 200 50 100 150 200 



Figure 2.8: Left, the noisy interferogram; centre, the introduced phase 
(sohd Hne) and the evaluated one (dotted), right, the error between the 
evaluated and the introduced one. The mean value of this last discrepancy 
is: 3.8 • 10^"^ fim and the standard deviation is: 9.2 • 10^'' 

In order to figure out features of the algorithm, we generate an high number 
of intensity noise realizations (1000). For each of them, the interferogram is 
generated with the same constant OPD deviation added to the optical path. 
This is not a realistic case, but it is easy to understand and to control. As we 
could expect, the standard deviation of the evaluation error, averaged on the 
number of realizations, depends upon the current OPD, i.e. from the distance 
from the zero OPD, where the flux intensity reaches its maximum (figure [2?9l) . 
If the OPD noise is not constant, this feature is covered by error threshold due 
to the atmospheric OPD. 

Spectral performances 

Our task is now the comparison of the power spectral densitj0 (PSD) of the 
atmospheric phase with the PSD of the evaluated phase, to check how spectral 

description of PSD and its principal features can be found in chapter |4] 
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Figure 2.9: Standard deviation of the error on the evaluation of phase as 
a function of the OPD 

features of the introduced phase are reproduced in the spectrum of the evalu- 
ated one. 

We will first analyze the spectral characteristic with a noisy interferogram, 
but with a poor realistic noise, i.e. a linear ramp, in order to check particular 
behaviour of the algorithm. We will then use a realistic noise, checking the 
influence of the noise on intensity comparing the algorithm performance on 
interferogram with or without intensity perturbance. 

So, let us begin adding to the interferometric signals the intensity noises de- 
scribed in the previous paragraph: realizations of shotnoise with a SNR=10, 
photonic noise and scintillation, drawn as a noise with a decaying spectrum 
6^"^). The sampling is set at 4 kHz, with 20 samples for fringe (a fringe in 
5 ms, i.e. 200 Hz). 

We first introduce an atmospheric phase drawn as a linear ramp with coeffi- 
cient 0.5 of the modulated path. 

The PSD behaviour remains the same before and after the algorithm applica- 
tion (see fig. 12.101) . We have to notice, however, a peak in correspondence of 
the 200 Hz frequency, corresponding to the fringe frequency. This is clearly an 
artefact, but it can be recognized on the evaluated OPD, too, as a superposed 
sinusoidal pattern over the linear shape. This is a residual of the interferomet- 
ric modulated component. 

We now perform the more realistic simulation adding the atmospheric noise 
described in figure 12.51 to the optical path. 

For both cases described above (nominal and noisy intensities), we generate 
a statistic of = 200 noise realizations, we evaluate the phase for each of 
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Figure 2.10: Phase evaluation on signal with noisy intensity. Left, intro- 
duced versus evaluated (dashed) phase; right, the respective PSD 



them and we compute the PSD of the two series. Finally, we average all 
PSDs averaged over the number of realizations. Fig. 12.111 shows this averaged 
PSD for the nominal (left) and corrupted (right) interferometric intensities. 
The left picture reveals that the algorithm augments the noise on the phase, 
especially at higher frequency. It seems, however, that this added noise source 
affects all the frequencies over a threshold. A similar situation is the case 
of interferometric outputs with perturbation on the intensities (figure 12.111 
right), but the added noise is greater, i.e. the offset between the PSD of the 
atmospheric phase and of the evaluated one is greater. 

This is due to the modeling of disturbance on intensities. In fact, the shotnoise 
is based on the realization of a normal random variable, the photonic noise is 
designed as a Poisson variable (even if, at this flux level, it is approximated by 
a normal variable too), so these components have a flat spectrum. 
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Figure 2.11: PSD of introduced and evaluated (dashed) phase in two dif- 
ferent simulations: without noise on interferometric intensity (left) and with 
(right). For both, the noise on the optical path is a realistic one (figure [275]) 
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Remarks 

The magnitude of the source is an important variable. Given its flux intensity 
at the reference magnitude (zero magnitude) for a wavelength, the flux inten- 
sity of the light reaching the detector surface is linked to the magnitude m by 
the relation: 

/m = /0-10-^. 

However, this algorithm is robust against the source flux intensity, because the 
flux offset must be eliminated before applying it. At the same time, it requires 
a good knowledge of this offset. Another hmit is the need of integration over 
complete fringes, in order to be able to apply the trigonometric relations. 
Therefore, a good knowledge of the source spectrum is also necessary, to deflne 
a correct effective wavelength. 

We remark, finally, that in practice a suitable effective spectral bandwidth with 
appropriate intensity can be used. This produces different templates ti and t2, 
but allows to adapt this algorithm to a wide range of different instruments. 

2.2.3 Correlation with a template 

Another algorithm we proposed for the detection of the phase modulo A with 
FINITO is the 'correlation method'. It is based on the correlation between 
the measured interferogram and a template, in order to find the maximum of 
the correlation function. The OPD correspondent to that maximum in the 
template is the current OPD. 

Its greater advantage with respect to the demodulation algorithm is the fact 
that it doesn't require the modulation over an integer number of fringes, re- 
laxing the requirements on the knowledge of the working spectral range. 

The correlation is limited to a single fringe. At the beginning of the scan, the 
interferogram intensity is collected for a fringe, then the template is correlated 
to find the actual phase on the OPD. For the next step, every collected sample 
can be added to the fringe pattern, discarding the oldest one, and correlated 
again with the template. In this way, there is a loss of information just for the 
time needed for the first fringe collection. 

The maximum of the correlation function is found searching the maximum 
of the interpolating polynomial, of second order being the involved functions 
sinusoidal. 
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Being the correlation limited to a single fringe, the template does not need to 
be modulated. Besides, the introduction of modifications to take into account 
envelope modulation, or slight departure from the full fringe assumption (i.e. 
variations of the effective wavelength), is straightforward. 

We first seek for the behaviour of the algorithm when there is no noise on the 
OPD modulation, and on the intensity pattern. The input interferogram is 
modelled following the description of eq. 12.141 and with the following param- 
eters: 

• nominal intensities for Ji and J2: 3.085 ■ 10^, for magnitude 13 

• visibility V = 0.73 

• waverange: H band, with central wavelength Aq = 1.65/im 

• quantum efficiency: 60% 

• sampling frequency: 4 kHz 

• fringe per semi-ramp: 10 

• sample per fringe: 20 

• OPD range: 16.5/im, i.e. 10 fringes 

and the template is constructed as the sinusoidal wave: 

27r 

t{x) = sin{— ■ OPDt) (2.21) 
Ao 

where the subscript i for the OPD is the same stepping of the modulation 
ramp (a twentieth of Aq). 

Figure 12.121 shows the evaluated maximum of the correlation function. The 
most remarkable result is the superposed oscillatory behaviour. Its mean is of 
— 3.3nm and its frequency is roughly double the fringe frequency. This effect is 
probably due to a beating between the interferometric signal and the template. 
This is the best performance of the algorithm in nominal conditions, and its 
performance is still good, compared to the ideal case. The model induced error 
is acceptable, in most observing case. 

When the introduced phase on the OPD is no longer zero, we find again the 
oscillatory phenomena, even if the algorithm is able to follow the OPD pattern. 
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Figure 2.12: Error on the phase evaluated by the algorithm when no noise 
is introduced over the modulated OPD. The mean error is —0.003 /im, its 
standard deviation is 0.003 /im. 

Different cases are shown below (see figure 121131) . with a hnear phase on the 
modulated OPD, and then with a more realistic noise (see fig. 12.51) . 
Figure 12.141 shows what happens when the introduced phase induces the over- 
all OPD to be greater than A, i.e. when the intensity jumps in a lateral fringe 
instead of remaining into the central one. This phenomenon is called fringe 
jump. To simulate it, we generate the same atmospheric noise than before, but 
with a greater amplitude (factor 10), so that the noise induces big shift of the 
whole interferogram. 

Finally, we perform a realistic simulation adding both atmospheric turbulence 
on the OPD and noise on the interferogram intensities. The results are shown 
in figure 12.161 

We have to notice, however, that for atmospheric noise the performance is 
worsening, even with nominal intensity. 

Analitically, it can be convenient to model the interferometric beam and the 
template as sinusoidal waves at different frequencies pS]: 

Si(t) = C0S([CJ„ + Uatm] " t) 

st{t) = cos(w„-t) (2.22) 

where Um = is the frequency of the modulation, and N is such that t covers 
a wavelength in a period T. At the same way, Uatm is the unknown atmospheric 
frequency. 
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Figure 2.13: Linear noise on the OPD. nominal intensity. Left: input phase 
on OPD, center: evaluated phase, right: difference between the previous. 
The mean error is 0.2 nm, its standard deviation is 0.002 /zm. 




10 



Figure 2.14: Atmospheric noise on the OPD, nominal intensity. Left: in- 
put phase on OPD, center: evaluated phase, right: difference between the 
previous. The mean error is —0.122 /zm, its standard deviation is 0.130 /xm. 




Figure 2.15: Atmospheric noise on the OPD, nominal intensity. Left: in- 
put phase on OPD, center: evaluated phase, right: difference between the 
previous. The graphic range is different from before, since the amplitude of 
the noise is greater than l2.14l of a factor 10. The mean error is 0.283 /im, its 
standard deviation is 0.495 /im. 
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Figure 2.16: Atmospheric noise on the OPD, noisy intensity. Left: in- 
put phase on OPD, center: evaluated phase, right: difference between the 
previous. The OPD noise does not cause fringe jumps. The mean error is 
—0.121 nm, its standard deviation is 0.129 /xm 



The correlation function between the two beams is: 

-T+T fT+T 



/T+i rr+i 
Si{t) ■ St{t + (l))dt = / COs{[Um + UJatm] * t)cOs{LJni ■ t + (l))dt = 

t-T+T 

= COS(0) / COsiXuJm + OJatm] * t)cOs{uJm ■ t)dt + 

/T + T 
COs{[uJrn + ^^atm] * t)sin{Um • t)dt (2.23) 



Maximum of the correlation function are among the zero of the derivative 
function 



/T+T 
COs{[uJm + OJatm] * t)cOs{uJm ■ t)dt + 

t-T+T 

COs{(f)) / COs{[uJrn + ^atm] * t)sin{uJm ' t)dt = (2.24) 



from which we obtain: 



jl^'^ COs{[uJm + OJatm] * t)sin{uJm ■ t)dt 



tan{ct>) = , (2.25) 

COS[[uJm + OJatm\ * t)cOS[uJm ' t)dt 



which leads to: 



where we have defined: 



a = -— , 7 = -—7: 2.27 
<^m a + 2 
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A closer look to that formula leads us to a few considerations. 
First of all, we recognize the doubled modulation frequency that we have dis- 
covered on the evaluated phase. This factor is generated by the algorithm 
itself, since the other terms depend just on the introduced phase, referred to 
the initial condition [uatm.''') , and can be weakened by taking as reference the 
measurement of the OPD at time r. 

If we assume the atmospheric frequency to be low with respect to the modu- 
lation one, so LJatm -C LJm, then a ~ 0, 7 ~ 0, and we are left with: 

tan{(f)) ~ tan^UatrnT). 

However, in our first simulation no external phase was introduced. In that 
case the error had a constant threshold and a modulation. This effect can 
be caused by the fact that the template is calibrated on the central fringe, 
while the interferogram is polychromatic, and the effect of polychromaticity is 
stronger at increasing distance from the zero OPD. 

Remarks 

This algorithm provides quite good results, but has a threshold error that can't 
be avoided even in nominal situation. This is due to the shape of the template 
for the correlation, which relies only on the introduced modulation path and 
the working wavelength, that should be known and sufficiently stable. How- 
ever, a more detailed template should require a proper calibration of fringes 
parameters such as intensity, spectral range, and so on. These enhancements 
depends, however, on the knowledge of the instrument. 

Oscillations are often present, and in many cases sufficiently small. They are 
due basically to the beating induced by the mismatch between the real fringe 
frequency and the modulation speed or the external OPD rate of variation. 

2.2.4 Modified ABCD 

The AC and ABCD algorithms are modifications of the simple trigonometric 
equations that apply in the ideal case (eq. 12. 2p . With these algorithms there 
is no need to detail the model of the incoming beam, but its normalization is 
mandatory, otherwise the trigonometric relations are lost. 
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In fact, the aim of the normahzation is to bring the signal to a sinusoidal wave 
with unitary amplitude and zero mean. There are two scenarios: if photometry 
is available (as the case of FINITO) and if not. 

In the first case, the photometry values can be used in real time for the correc- 
tion of the interferometric outputs. This method is used, e.g., by the VINCI 
interferometer (^14j), and by FINITO. 

In the latter case, information about the offset and the current amplitude of 
the interferometric signal must be deduced independently, e.g. with a slow 
calibration off-line. This is the case of the PRIMA FSU, and will be discussed 
later. 



If the photometry is available, the interferometric outputs can be described 
in terms of the photometric ones. A detailed derivation of a calibrated inter- 
ferometric pattern when working off-line can be found in [H]. In this case, 
the working conditions are relaxed, and data can be carefully denoised before 
using them. 

When working in real time, however, the operations must be reduced to the 
minimum. This is the case of FINITO, and we illustrate it. 
If Pa and Pb are the photometric inputs, and Ji and I2 the interferometric 
signals after the combination, then a simple normalization is, derived from 



jnorm _ ~ PlA^A - Pi,bPe 
-'1 — 



VPI^ 

jnorm -^2 " (^2,aPa - /32,bPb ,^ r,Q\ 

- — 7Km — 

The j3 coefficients have to be evaluated before the observation, monitored and 
periodically updated. They intrinsically contain information about the source, 
e.g. the wavelength distribution, but also instrumental ones, such as the cou- 
pling ratio of the photometric beam splitters, the efficiency of the transmission 
system from the combination to the detection. Their values can be influenced 
by all these factors. 

After the normalization, eq. 12.21 can be used, acquiring four samples for fringe, 
separated by a 7r/4 offset, through a modulation of each interferometric out- 
puts, for example. 

With this approach, the expected performance is that of ideal ABCD (see 
eq. 12.31) . In that formula, information on the incoming flux and its noise 
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are contained in the SNR term. If we suppose that the astronomical beam 
is subject to photon noise, background and detector noise, the SNR can be 
expressed as[30j: 

SNRv^ = ^ S-N,-V ^ 

where Np, Nb and Nd are the beam flux, the background flux and the standard 
deviation of the detector noise, respectively, all expressed in terms of number of 
detected photons, S is the Strehl ratio, i.e. the ratio of the reference intensity 
to the measured one, and V is the visibility. 



Application of ABCD method for FINITO at VLTI 

As mentioned before, the FINITO performance when installed at VLTI suffered 
for the bad working conditions. For this reason, the ABCD method has been 
chosen, among all proposed, for its robustness. 

However, this algorithm is sensitive to residual effects due to normalization, in 
particular to the interaction between the photometric signals PA and PB. In 
presence of such correlations, it is difficult to theoretically estimate their effects 
on the performance. This was the case of FINITO, subject to unexpected 
flux fluctuations [29]. Figure [2.171 first row, shows the calibration coefficients 
evaluation during one observational night: it is evident that they are changing. 
The residual noise from calibration, computed as the standard deviation of 
the normalized signals of eq. 12.281 compared with the standard deviation 
of the photometric inputs reveals that new features where added with the 
normalization. Figure 12.171 second row, shows the spectral behaviour of these 
two noises. 



2.3 Monitoring of intensity fluctuations 

For the reasons mentioned in the previous paragraphs, the need of intensity 
fluctuations control arises. In fact, if it is possible to rapidly get aware of a 
significant flux variation, the coefficients for the normalization can be updated, 
in order to better fit the injected beams features. 

In this section, we propose [31] two different intensity estimation algorithms 
based on the estimated phase. We analyze the effects of a discrepancy of the 
intensity on the estimation of both phase on the optical path and on intensity 
itself, in a ideal situation. In fact, for every algorithm seen till now, the 
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Figure 2.17: First row, calibration coefficients monitored for one night; 
second row, comparison between the spectral behaviour of the noise due to 
normalization and the standard deviation of photometric channel. These 
figures are taken from [23] . 





knowledge of the current flux intensity is crucial, and we have seen how noise 

on intensity causes in turn an error on the OPD estimation. 

Let us describe the two complementary outputs on an interferometric system 



as: 



Si(x) 
S2ix) 



I ■ 
I ■ 



, 27r 
1 + K sin I —X 

1 + V cos ( —X 



(2.30) 



In the equation, I is the flux intensity, V is the visibility, A is the working 
wavelength and x is the optical path difference. In ideal case, /, V and A are 
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constant. So we can easily find the phase = as: 



rriiix) A , -,\ 

arctan -— x = — cp (2.31) 



where we have defined: 



/27r 

rai{x) = si{x) — I = J -l^ sin (—a; 

m2{x) = S2{x) - I = I-Vcos(^x^ (2.32) 

Let now allow the intensity to vary: / = Jq + 6Iq, where Jq is the known nom- 
inal intensity, while 6Iq is unknown. We find an estimation of the intensity / 
first minimizing an error function, then with trigonometric elaboration of the 
original signals. 

For the first approach, if we subtract from Si{x) and ^2(3;) the Jq nominal 
intensity instead of the true /, we obtain the following signals: 

/27r 

mi{x) = si{x) - Iq = 6I0 + {Iq + SIo)V sin I— X 
m2{x) = S2{x) - Jo = 6I0 + (/o + 5/0)^ cos (^yX^ (2.33) 

Substituting them into eq. 12.311 we obtain a perturbed estimation of the 
phase (j): 

rriiix) 

= arctan ^ ) ! (2.34) 
With this estimate of the phase, we construct two sinusoidal templates: 

ti{x) = Iq ■ V sincp, t2{x) = Iq ■ V cos (j) (2.35) 
that we use to find the intensity value that minimize the squared error 

1=1,2 

The minimum is found searching the zero of the derivative function We 
find: 

~, , miix) ■ sin 6 + rh2(x) ■ cos d) 
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This evaluation suffers for the low number of measurements for each estimate. 
It is nevertheless interesting because it allows an iterative process for estimates 
of both phase and intensity: 

mi^i{x) = si{x) - /i-i(x), m2,i{x) = S2{x) - ii-i{x) 

~ mi^i{x) 
(pi = arctan — -— 
m2,i{x) 

~ _ mi^ijx) ■ sin (pj + m2,i{x) ■ cos (pj 

The second iterative method we analyze is based on the manipulation of the 
equation sf^x) + sl{x), which leads to: 

^ , 9^ sl{x)+sl{x) , 1 = n O-^R-) 

J2 Jo /2[2 + V^2 + 2r(sin0 + cos0)] ^ 

which has two solutions: 



, / sl{x) + sl{x) 

/o Y /2[2 + l^2 + 21^(sin0 + cos0)] ^ ' 

For < V < 1 and V0, we have 2 + V'^ + 2K(sin + cos 0) > 0, so the two 
solutions exist and they are real. The factors of the ratio in the square root 
are comparable, the only difference is the intrinsic influence of the perturbed 
/ instead of Iq in sf^x) + sl{x): the ratio is near zero. So the two solutions 
are: 

^~0, ^--2. (2.40) 

-'0 -'0 

The first one is the searched solution. 



Also this estimate of the intensity / can be used in an iterative process similar 
to that described in eq. 12.371 

Figure 12.181 shows some iterative steps in the evaluation of both phase and 
intensity, for the two described methods. In the simulation, the model param- 
eters take the values listed in table 12. 1[ 

The optical path difference is given in terms of angles (radians) . From the ana- 
lysis of the first row of the graphs (first method), we can notice a superposed 
sinusoidal error, both for phase and for intensity. It is at the same frequency 
of the fringe, it is probably due to the use of an estimated initial phase. From 
the second iteration the frequency doubles, but then it remains fixed. It can 
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visibility 


0.9 


nominal intensity /q 


12.34 (arbitrary units) 


optical path diff (rad) 


[-71, -n] 


noise on intensity 


constant (-2% of /=0) 


noise on OPD 


null 


number of samples 


1000 



Table 2.1: Parameters of fringes in the simulations. 



Iteration 


Mean(0i) 


[rad] 




Mean(/) 


0-/ 


I 


-8.95e 


-5 


0.0226 


12.096 


0.2743 


II 


-8.94e 


-5 


0.0173 


12.091 


0.2777 


III 


-8.93e 


-5 


0.0177 


12.091 


0.2778 



Table 2.2: First method: error function minimization 



be due to the fact that the first method foresees the comparison with a sinu- 
soidal template at frequency close to that of the signal. The iteration saturates 
immediately, without any further improvements. 

The second method shows a better convergence of phase toward the nomi- 
nal value. However, after some iterations the convergence decreases and then 
stops. We can notice that the intensity, apart from being modulated with more 
than a frequency, it is always overestimated. Tables 12.21 and 12.31 give a resume 
of mean and standard deviation of the estimates of both phase and intensity 
for the two methods. We can see that the standard deviation of the intensity 
is better (by a factor 10) with the second method. 

Finally, we remark that both these methods are based on a simple model, that 
does not foresee a superposed envelope, so they are apphcable in the central 
fringe. The model extension to a more complex function to get rid of the beat, 
and to cope with envelope modulation, is conceptually simple but was not yet 
carried on; it remains thus as part of the possible future developments. 



Iteration 


Mean(0j) [rad] 


[rad] 


Mean(/) 


0-/ 


I 


8.95e-5 


0.0226 


12.218 


0.0873 


II 


8.95e-5 


0.016 


12.186 


0.0902 


III 


8.8e - 5 


0.0136 


12.171 


0.0886 


VI 


8.7e-5 


0.0122 


12.161 


0.0864 



Table 2.3: Second method: analytical solution 
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Figure 2.18: Phase (left) and intensity (right) iterated estimates for the 
error minimization method (first row) and for the analytical one (second 
row). 
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Chapter 3 



Location and calibration 
algorithms for the VLTI 
PRIMA Fringe Sensor Unit 

The Observatory of Turin has been involved in the design and development 
of a fringe sensor unit for the VLTI PRIMA instrument, the PRIMA FSU. 
We will describe it in Sec. 13.21 The location algorithm proposed for this 
fringe sensor is based on the comparison of the current interferometric pattern 
with a tabulated template. The difference with the algorithms described in 
chapter[2]is the use of a highly detailed interferometric model. Of course, being 
based on a significant number of variables, the needs of their calibration and 
monitoring are much more demanding. However, good results can be obtained, 
in terms of model accuracy and variables estimations. In this chapter, we 
will introduce the interferometric environment used for PRIMA FSU: model, 
algorithms and calibration tools. My work has focused especially on the last 
task, i.e. development and validation of diagnostics and calibration tools, both 
with simulated and laboratory data. 



3.1 Introduction of a detailed interferometric 
model 

In the previous chapter we have analyzed some algorithms for the location 
of the fringe position with respect to the OPD scan. We have noticed that 
a relevant problem was the need of a good description of the interferometric 
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output; otherwise, the results of all algorithms suffer of additional error caused 
by the lack of similarity between the real signal and the template (correlation 
method) or the filter output (demodulation). 

Methods hke the classical AC or ABCD are robust in this sense, but they im- 
pose strong requirements in terms of normalization, requiring a deep insight 
into the calibration. 

The model we have used till now is based on strong assumptions on the source 
and on the instrument features, such as constant flux over the spectral range 
for the star, constant response function over a rectangular spectral band for 
the latter. If we introduce in the model the possibility of tuning as much as 
possible of these characteristics, wc can reach a better accuracy. 
So, before proceeding, we have to introduce a more realistic and detailed model 
of the interferometric pattern, using more parameters than those used till now, 
separating the contribution of the source, of the atmosphere and of the instru- 
ments. 

The instrument is characterized by a spectral transmission factor p(A), a visi- 
bility Vi{X), a detector quantum efficiency QE{\) and integration time r that 
influence the flux intensity, and it introduces its own phase to the optical path 
4>i{X). The source has its own spectral distribution of the intensity /(A), a 
visibility V^(A) and a magnitude m. The atmosphere effect can be described 
by a factor representing the wavefront degradation i]a{)^) and by reflective in- 
dexes ni(A) that depends from the path in air p of the astronomical beams 
in the delay line, compensating the distance from the zenith of the observing 
direction, and so from the relative position of source and telescope. All these 
parameters, except the source magnitude, are wavelength-dependent, and so 
they influence the effective working wavelength Aq, which is no longer a fixed 
nominal value in the middle of the wavelength range. 

We are now able to describe the monochromatic signal for an interferometric 
channel by a combination of all these parameters: 

/(A,x) = Is{X)-T-p{X)-QE{X)- (3.1) 
• 1 + Va{X) ■ Vi{X) ■ V{X) ■ sin (^0,(A) + ^[n(A) • x + n,(A) • p]^ 

where Is is the effective source flux intensity given by the combination of the 
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emitted flux scaled by the star magnitude and the telescope collecting area, 
and resulting from the beam interference. 

The polychromatic beam is the integration over the spectral range A = [Ai, A2] 
of each monochromatic component: 



In the following paragraphs, we will review the algorithm concept (Sec. I3.2.2p . 
giving an estimate of the expected error, and then we will briefly assess OPD 
and GD performance (Sec. I3.2.3p . A detailed analysis of GD and OPD perfor- 
mance, in different observational and atmospherical situations, and including 
perturbation on instrumental parameters, can be found in [32|. In Sec. I3.3l we 
will first discuss the importance of sensitivity analysis, with a working exam- 
ple, and then (Sec. 13.3. ip we will concentrate on the calibration issue, i.e. the 
derivation of the essential features of the model directly from data, and the 
reconstruction of the interferometric signal. 



In this Section, we describe the VLTI PRIMA Fringe Sensor Unit instrument 
and the location algorithm we proposed, based on the interferometric model 
introduced with eq. 13.11 

3.2.1 Instrument description 

A description of the PRIMA (Phase Referenced Imaging and Micro-arcsecond 
Astrometry) instrument can be found in [20], or at the web site: 
http : / / www. eso .org/ pro j ects /vlti/ instru / prima / index_prima. html. 
Its primary goals are the imaging of faint sources with an high angular resolu- 
tion and a high precision astrometry, at the /xas level. Finally, it also aims at 
increasing the sensitivity in order to be able to reach prohibitive magnitudes. 



For such reasons, a fringe tracking system is mandatory. PRIMA requires two 
identical dedicated fringe sensors, called FSU A and B, tailored for its needs. 
The PRIMA FSU (Fringe Sensor Unit) concept is schematized in figure 13.11 
The working wavelength range is the K band ([1.9 — 2.6] fim). Differently 




(3.2) 



3.2 Algoritm for PRIMA FSU 



till 19. 
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from FINITO, it does not foresee an active modulation of the internal optical 
path, since the estimate of OPD is based on the phase distribution among the 
different polarization and wavelength components. 




\—> . 



Figure 3.1: Schematic of PRIMA FSU design. 

Each FSU is fed by two telescope beams, that pass through an alignment 
system with five degrees of freedom per beam, aimed to correction of differ- 
ent aberration effects: two lateral (decenter) for pupil alignment, two angular 
(tilt) for image alignment, and a longitudinal term for OPD alignment. Their 
control is in charge of a dedicated software module, which convert mechani- 
cal to optical degrees of freedom through a kynematics matrix. The fringes 
are spatially sampled following an ABCD scheme. To implement it, before 
combination one beam is retarded of 7r/4, by an achromatic phase shifter im- 
plemented by a K-prism, then the two beams enter the combiner, a splitting 
cube with nominal transmission and reflection of 50%, then the two combined 
outputs are further split according to the polarization components with two 
independent polarising beam splitters, finally obtaining four beams with a rel- 
ative 7r/4 phase shift: the A, B, C and D beams. Each of them is then injected 
in individual optical fibres, that also act as spatial filters, as we mentioned in 
the case of FINITO. The fibres carry the four beams into the dewar, where 
they are mounted onto a mechanical reference, aligned thanks to four degrees 
of freedom: two lateral, one longitudinal (for image magnification) and one ro- 
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tational. Each beam is collimated and spectrally dispersed by a prism, before 
being focused on a PICNIC detector: most of the flux, about 70%, is retained 
in the central spot, while the remaining flux is split into two side bands and 
focused in neighbour pixels. In total, after each integration period the detector 
gives 12 values, corresponding to three sub-bands for each beam. In another 
way, it gives three sets of ABCD points, one for each spectral band. 



Also PRIMA FSU is monitored by a metrology system, that shares the same 
optical components with astronomical beams, giving an alternative monitoring 
of the optical paths followed by stellar photons. The metrology monitors the 
whole optical path from the FSU combiner up to the telescopes, thus providing 
sensitivity to environmental disturbances which degraded the performance of 
previous instruments. 



We remark some differences with respect to the FINITO design, apart the ob- 
vious mechanical and optical approach, based on the symmetry between the 
two arms. First of all, all photons are retained for the location algorithms, and 
there are no photometric beams such as in FINITO. Then, the fibres enter di- 
rectly into the cryostat, in order to minimize the effect of thermal background, 
which is relevant in K band. This means that their position must remain 
stable. After delivery, this part of the instrument has been upgraded, allow- 
ing motorized control of their positions, to further improve stability. Finally 
the spectral dispersion allows to simultaneously sample the interferogram in 
three contiguous spectral bands. In this way, it is possible to check the OPD 
position with the central pixel values, but also to directly assess the differen- 
tial phase shift among the spectral channels. This allows to have a chromatic 
phase information, which in turn may be used to provide an estimate of the ab- 
solute OPD, removing the fringe period degeneracy, i.e. the group delay (CD). 

The location algorithms proposed for both OPD and CD take advantage from 
the availability of a discrete number of values for each measurement step. They 
are based on the comparison between the measured signals and the global signal 
model, described in eq. 13. using a weighted least square fit. Note that the 
variables of eq. 13.11 are allowed to change between each channel, in order to 
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properly model the overall system, so eq. 13.11 becomes: 

/,(A, x) = Is,iW ■ n ■ p^{X) ■ QE{X) ■ (3.3) 

• 1 + VA,iW ■ Vi,i{\) ■ Vi{X) ■ sin (^(j)i,i{X) + f^HX) ■ X + ni{X) ■ p]^ 

with i = 1, . . . , 12. The choice of the cutoff wavelengths between the three 
sub-bands of K band, that we will hereafter indicate as Kl, K2 and K3, is of 
crucial importance, because all instrumental parameters depend on it. 



3.2.2 Algorithm concept 

The choice of spatial modulation of the fringe pattern over the detector through 
the combination of beams separated by known phase offsets makes available 
four samples in approximate quadrature for each K sub-band. The OPD/GD 
evaluation is based on the comparison of the measured interferometric signal 
with tabulated FSU output, computed for a set of OPD, respectively GD, 
and source, atmosphere and FSU parameters computed by a calibration before 
integrating. The most likely OPD or GD value is identified in the least squares 
sense, by an iterative technique. 

Instead of minimizing the quadratic error between measured and tabulated 
measurements, the algorithm searches the zero of the derivative function 
(Newton-Raphson method of zero-crossing). An initial approximation is needed, 
and from it depends the descent of the error gradient. The minimum search is 
done over the three sub-bands simultaneously. 

The error function e, called hereafter discrepancy, between the measured Sn{z) 
and the tabulated fn{x) values depends on x and z, i.e. the template and the 
unknown OPDs, respectively. Each measurement is affected by an error, rep- 
resented by the random variable e„. We assume that the {en}n<o are mutually 
uncorrelated, and such that 

E[en] = 0, E[el]=al Wn (3.4) 

So the discrepancy has to be weighted to take into account the variability 
of each measure: 

eix,z) = Y,[sniz)Ux)]yal (3.5) 

n 

The subscript n varies over the three sub-bands and over the number of chan- 
nels in each sub-band (4, i.e. the ABCD points). 



VLTI PRIMA FSU 



65 



If we seek an estimator for the z unknown, the natural choice is x. The best 
estimate of the current OPD/GD value is reached when x = z. Thanks to the 
assumption on the measurement errors, provided that the signal model is ade- 
quate, it can be derived ([33], p. 853]) that the estimation obtained minimizing 
eq. 13.51 is unbiased and optimal in the least square sense: 



E\{x-z)\ = 



, -1 

P/ \2 



E[{x - zf] = a\x) = (5^^) (3.6) 



when is the template signal derivative. 

The derivative function in the x variable of the discrepancy is given by: 

h{x, Z) = -J2 = E - fnix)] ■ Qnix) (3.7) 

n n 

where gn{x) = — /^(x)/cr^ is the weight function. The minimum discrepancy 
is reached when h = 0. We separate the template factors of the preceding 
formula from the one containing the measured signals, and we obtain: 

h{x, Z) = ^ Sn{z) ■ gn{x) - E fn{x) ■ fi'„(x) = ^ Sn{z) ■ gn{x) - l{x) (3.8) 
71 n 71 

having defined the bias function l{x) as Xln ' 9n{.x) = — ^„ _ 
The first order approximation with Taylor series in the x point for the discrep- 
ancy is given by: 

h{z, z) = h{x, z) + h\x, z) ■ {z — x) + o{{z — xY) (3.9) 

This formulation gives an iterative procedure for the approximation of the 
estimate x of z; given the Zj-i estimate, the following Zj is: 

, , T.nSn{z)- gn{x)-l{x) 

with 

h'{x, Z)=J2 + E 1'r.iz) - UX)]^ (3.11) 



The useful thing to notice is that the quantities h\x,y), gn{x) and l{x) are 
known once the template fn{x) has been defined in terms of the source and 
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environmental conditions shown in section [XT] before starting the observation, 
so they can be tabulated off-line, with a significant saving of computational 
load for real-time operation. This defines an iterative method for fast on-line 
computation of OPD and GD. 



Derivation of the formula and error estimation 

th 

Let us consider the equation t{x) = 0, and the j approximation Xj of the 
zero crossing abscissa. The Taylor expansion of f{x) in the neighborhood of 
Xj is given by: 

t{x) = t{xj) + {x- Xj)t'{xj) + R2{xj) (3.12) 
where R2{xj) = o{{x — XjY) is the error term. We easily obtain: 



t{x) = ^ x = x,-^-^^^^—^^ (3.13) 



The error done with this truncation is _!(i£2]_5ii£2l_ 

t {Xj) 

If we substitute t{x) with /i(x, z), we can write the error term as: 

where R2{xj) is of order o{{x — xjY). So, the convergence to zero of e{xj) 
depends on the distance z — Xj and from the value of h'{xj, z). We have seen 
in eq. I3.11l that the latter depends on the first and the second derivative of the 
signal model fn{x), we check when this term is zero, to avoid discontinuities. 
From eq. 13.111 we have that h'{xj, z) = if: 

/;(x,)=0, Vn A Sniz) - Uxj) = 0, Vn (3.15) 

or if: 

/;(x,) = 0, Vn A /:(x,) = 0, Vn (3.16) 

In the first case, the formula is exact, so the error term is zero. For the second 
case, we schematically write /«(x), following eq. 13.11 as a sinusoidal function 
fn{x) = An sin(27rx + 0), which leads to the following statements: 



f'{x) = 271 An cos(27rx + 0) 
f"(x) = -47rM„ sin(27ra; + 0) 



(3.17) 
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and they are equal to zero if: 



fix) = 



fix) = 




(3.18) 



so they can not be both zero. So, we are sure that the error function never 
diverges to oo. 

Of course, the magnitude of the error on the estimation depends on the good- 
ness of the approximation x — Xj. 

GD estimation 

For the GD estimation, applying this procedure on the coherence length would 
require a large amount of time, incompatible with the desired high OPD control 
rate. A useful approach is to adapt the described algorithm to a large path, 
i.e. over three or five fringes around the central one. The OPD estimate is a 
good first approximation, since it is a minimum; the search in nearby fringes 
assures that the result is a global minimum, and not a local one, as the OPD 
could be. The estimation error can increase, because xqpd — z at the first 
iteration could be greater than 2n. 

3.2.3 Algorithm performance on OPD and GD 

The algorithm description highlights the need of a signal model, from which 
all the tabulated functions can be derived. In this section, we describe the 
software implementation of this model, and the principal parameters. 

Numerical description of the FSU 

We have already said, in Sec. 13.11 that the parameters that appear in eq. 13.31 
are all wavelength dependent. This fact has suggested us an implementation 
procedure, based on the description of the interferometric model parameters 
as a function of the wavelength. Moreover, a version of each function can be 
tailored on the characteristic of the single channels. 

From this description, it is possible to write the monochromatic interferogram 
at wavelength A for a selected channel, over an assigned OPD scan, just se- 
lecting the corresponding values, and substituting them in the eq. 13.11 
The polychromatic interferogram of each channel is now simply the sum of the 
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corresponding monochromatic ones. 

It is evident that this approach allows to manage the spectral description of 
the channel very easily, because it construct the polychromatic interferogram 
simply as a sum. 

Instrumental parameters can be calibrated with laboratory test, starting from 
the reference description given by the constructors of each component. We now 
describe how we modeled the source, the background noise and the atmospheric 
turbulence. 



Source spectral description 

The model we have chosen for the source is the black body at a given tem- 
perature. The black body is a theoretical object that absorbs all the radiation 
that hits it, without any reflections. It is also a perfect emitter of radiation, 
at all wavelengths because it has to be able to absorb at every wavelength. 
Even if in practice no material has been found to be a perfect blackbody, it is 
a convenient model because it is defined at all wavelengths. As the tempera- 
ture decreases, the peak of the radiation curve moves to lower intensities and 
longer wavelength. Figure 13.21 shows the normalized blackbody intensity for 
three temperatures. 
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Background noise 

The background noise, i.e. the radiation emitted by the environment on which 
the instrument is immersed, can be modeled as a blackbody at environmental 
temperature (300 K). It emits in the near infrared, and its spectral distribution 
is depicted in figure 13.31 

Normalized blackbody emiBsion at environmental temperature 

0.2 I 1 1 1 1 I I 



Figure 3.3: Normalized blackbody radiation for background noise [300 K]. 



Atmospheric noise 

The atmospheric noise can be modeled as in chapter [21 Section B.2.2[ We recall 
here that it can be described with its power spectral density, proportional to 
/~3, where / is the frequency, and it depends on atmospheric parameters, 
such as the wind speed and the Fried parameter; its phase is assumed to vary 
randomly with a uniform distribution in the range [— 7r,7r]. The parameters 
depends on the geographical location, and on weather conditions. 



OPD performance 

The nominal OPD performance is limited by the expected noise on the signal 
with respect to the template one. Additionally, we can expect systematic errors 
due to model limitations and non-linearity of the interferometric process. To 
assess the OPD performance, we look at different parameters: the discrepancy 
between the input OPD and the evaluated OPD, the linearity of the correlation 
of the two OPDs, and the number of fringe jumps, i.e. the percentage of choice 
of a lateral maximum instead of the correct maximum in the central fringe. 
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We model the celestial source as a black body with effective temperature of 
3500K, that is, a quite faint source. Fig. 13. 4[ shows the spectral distribution 
of the source spectrum, split into the three sub-bands Kl, K2 and K3. We set 
the model parameters to their nominal values; they will be described in more 
detail in Sec. 13.3.11 In particular, the relative fluxes in the three sub-bands 
are scaled accordingly to realistic values for the transmission and phase of the 
FSU, of the VLTI, to take into account the fluctuations sources outside the 
fringe sensor, and for the quantum efficiency of the detector. The background 
noise is distributed as a blackbody at 300K, but it is also split following the 
spectral division between K subbands (13.51 right). 



1 




Wavelength [\Lm] 

Figure 3.4: Source spectral division in the three K bands. 




Wavelength [|im] 

Figure 3.5: Background radiation in the three K sub-bands. 



We express the signal conditions in terms of the Signal to Noise Ratio (hereafter 
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SNR). We define it approximately as: 



SNR 



N,*QE 



(3.19) 



^/Ns *QE + ROm 



where A^^ is the total received flux, comprehensive of the visibility and of in- 
strumental factors, while QE is the quantum efficiency. For the noise term, 
we consider the photonic and the readout (RON) noises. 

We consider a set of observational situations, with sources at different mag- 
nitudes and integration time, which leads to different SNR. The described 
situation are quite critical, for short integration time and limiting magnitudes. 
They are listed in table 13.11 for decreasing SNR values. 



The standard deviation of the discrepancy between the nominal and the eval- 
uated OPD and the linearity between the two are reported in figure 13.61 while 
fig. 13.71 shows the percentage of fringe jumps. In all picture, the red dotted 
line corresponds to limiting performance requirements given by ESO. We can 
see that the algorithm gives good results in terms of both evaluation error and 
linearity, while the fringe jumps are a more critical issue. 



mag integration time (ms) RON (electron) 



SNR 
352.5201 
192.8585 
45.7332 
19.9307 
23.4404 
23.0549 
18.1428 
10.3002 
7.2743 
5.1856 
3.8227 
2.3320 
2.2520 
0.9327 



7 0.25 11 

8 0.25 11 

10 0.25 11 

11 0.25 11 

13 2 11 
19 10000 4 

14 2 4 
19 2000 4 
19 1000 4 

16 4 4 

17 10 4 

18 20 4 

19 100 4 
19 20 4 



Table 3.1: SNR vs. magnitude and exposure time. 
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standard Deviation of discrepancy versus SNB 



;arily of correlation of inpufcutpul OPD versus SNR 



45 50 



Figure 3.6: Standard deviation of the discrepancy between nominal and 
evaluated OPD (left) and linearity between the two measures (right). 



ParcenUga o1 fringe jumps versus SNB 



20 40 60 



130 140 160 180 200 



Figure 3.7: Percentage of fringe jumps at different SNR. 



GD performance 

In this section, we report the GD performance in different observational sit- 
uations, with a range of values for integration time, magnitude and readout 
noise. The simulated source is again a blackbody with temperature 3500 K, so 
a rather red source. Results are reported in table Its analysis reveals that 
the algorithm performance is good. However, the integration time is an impor- 
tant variable: for faint magnitudes, it can affect the performance dramatically 
(fourth row in the table). This imposes some constraints on the GD evalua- 
tion: while the OPD can be estimate at high rate, for a good GD estimation 
a longer integration time is required. This fact have impact on the software 
development of the detector readout. Thus, the side spectral band pixels can 
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case 


magnitude 


Integration time 


KUiN 


a 


iU 


c 



O.i 


u 


1 9 


c; 
o 




C 


14 


200 


0.8 


d 


16 


200 


0.8 


f 


16 


2000 


0.3 



Table 3.2: Description of GD observational situations. 



case 


GD noise sd[nm] -req 


GD noise sd [nm] 


fj (%) req. 


fj (%) 


1 req. 


1 


a 


900 


5.394 


1% 


0% 


1 ± 0.1 


1.0001 


b 


3300 


138.38 


1% 


1% 


1 ± 0.1 


1.0011 


c 


800 


7.302 


1% 


0% 


1 ± 0.1 


1 


d 


1900 


1071.79 


1% 


59% 


1 ± 0.1 


1.019 


e 


600 


11.39 


1% 


0% 


1 ± 0.1 


1.004 



Table 3.3: GD performance, in terms of standard deviation of GD noise 
(GD noise sd), percentage of fringe jumps (fj) and linearity coefficient (1). 
For each of them, the limit requirements are reported (req.) 



be read at the lower GD rate, whilst the central band ("white light") pixels 
are read at the faster OPD rate. 

As for the OPD case, we resume the GD algorithm performance in terms 
of the standard deviation of the discrepancy between the nominal and the 
evaluated GD, and of the percentage of fringe jumps (reported in figure [33]). 
All quantities are given as a function of the magnitude of the stellar source 
and of the integration time. The limiting performance required by ESO is 
represented as red dots. We can notice that the integration time is a crucial 
variable for the algorithm performance, especially for high magnitudes. 
Finally, we show in fig. I3.9l the linearity between the nominal and the evaluated 
GD. We find again that the performance is good till the limiting magnitude 
m = 19. 

3.3 Calibration and Sensitivity analysis 

The proposed description of the interferometric signal requires a good knowl- 
edge of a lot of parameters. The uncertainty on them causes the error on the 
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screpancy versus magnitude 



15 16 17 



Percentage of fringe jumps versus magnitude 



+ effective 
5% threshold 



Ht-|TI = 1 m 



^^ |TI = 5ms I ^^ |ti = 5ms\ +b 



3 



ID 11 12 13 14 15 16 17 18 
magnitude 



Figure 3.8: Standard deviation of the discrepancy between nominal and 
evaluated GD (left) and number of fringe jumps (right). 



input/output OPD versus magnil 



17 18 19 20 



Figure 3.9: Linearity between nominal and evaluated GD. 



OPD and GD to grow. 

In the described implementation of the FSU model, the needed parameters can 
be divided in four categories, resumed in the following list. 
Source parameters: 

• effective temperature 

• flux at zero magnitude 

• effective magnitude 
Atmospheric parameters: 
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• refractive index of air (in laboratory or in site) 

• unbalanced air path (in laboratory or in site) 
Instrumental parameters: 

• phase and transmission of the VLTI before the FSU 

• visibility of the VLTI before the FSU 

• phase and transmission of the FSU 

• visibility of the FSU 

• thermal background at the input of the FSU 

• detector quantum efficiency, and conversion factor between photons and 
photo-electrons at the output of the detector 

• cutoff wavelengths between FSU bands 

• wavelength array for K band: [1.9 — 2.6] /im 
Observation-depending parameters: 

• pointing parameters, such as air path 

• integration time 

• detector read-out noise (changing as a function of the readout mode) 

The knowledge of all these parameters with a sufficient accuracy is of crucial 
importance. A discrepancy from the nominal value can cause perturbations 

over other model parameters. The study of the possible effects of parameters 
variation is called sensitivity analysis. We give an example of it, investigating 
the effect on the working wavelength of a misalignment of the fibres with 
respect to their nominal position. 
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Fibers displacement 

The nominal position of the fibres carrying the FSU outputs to the detector 
are supposed known. A perturbation of their alignment modifies the energy 
distribution on the detector, modifying the intensity on each spectral channel, 
but also the effective wavelength. This effect could induce an apparent phase 
shift. Due to spectral dispersion over at least three pixels, a displacement 
along the dispersion direction (say x) has more impact than a perpendicular 
perturbation (say y), modifying the effective wavelengths in each band. The 
simulation of the imaging quality on the detector (summarized in the point 
spread function) is done by Code V, a ray-tracing software. The model can 
be tailored on the real optical system. To obtain the overall PSF, the point 
spread functions at each wavelength are summed together. The charge diffu- 
sion on each pixel is not uniform, but is modeled by a Gaussian distribution, 
whose parameters are determined by the physical instrument. Also, the pixel 
response distribution is modeled, from literature data, to account for the lower 
sensitivity close to the pixel edges. 

Figures 13.101 and 13.111 show the modification on the total fiuxes in each K 
subband and the variation of their effective wavelengths, respectively, when 
we simulate a fibre movement in the x and y directions. For the y direction, 
we can notice that the effects are not negligible only for large displacement, 
comparable with half the pixel size, i.e. the sensitivity to this perturbation is 
low. 

Percenlage of lluxes in K band Fluxes (%] in K band 



y displacement [um 



Figure 3.10: Flux intensities variations when a displacement of the fibres 
is applied on the x direction (left) and on the y direction (right). 



The estimated effective wavelength ranges on an interval around the nom- 
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Figure 3.11: Effective waveiengtli perturbation when a displacement of tlie 
fibres is applied on the x direction (first row) and on the y direction (second 
row). 



inal one, i.e. at zero displacement. We evaluate the width of this interval, in 
terms of percentage of the nominal wavelength, and we list the results in table 

X displac y displac 

Kl [2.07/im - 0.91%, 2.07/im + 1.56%] [2.07/im + 0.73%, 2.07/im + 0.34%] 

K2 [2.25/xm - 2.44%, 2.25/im + 2.45%] [2.25/im - 0.22%, 2.25/im + 0.32%] 

K3 [2.42/im - 1.12%, 2.42/im + 0.55%] [2.42^m - 1.05%, 2.42/im - 0.38%] 

Table 3.4: Effective wavelengths variations. 



3.3.1 Calibration 

The example we have illustrate in the previous paragraph shows the impor- 
tance of the calibration. Moreover, due to the properties of the PRIMA FSU 
location algorithm, the calibration is a fundamental issue for OPD and GD 
performance, because it provides the estimates of the current value of all the 
source and environmental parameters needed for the definition of the tabulated 
templates. 

Calibration strategy 

The calibration strategy proposed for PRIMA FSU must be implemented be- 
fore an observation, is performed with laboratory sources (such as blackbodies 
and lasers) and can be described by the following steps: 
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1. evaluation of the values of the previous parameters, with the exception 
of the phase and the transmission of the FSU 

2. definition of the parameters related to the OPD scan 

3. Fourier transform of the FSU signal in the wavelength space, for the 
estimation of the overall phase and transmission 

4. estimation of the effective magnitude 

5. FSU template signal build-up 

For the first step, dedicated measurements have to be performed, as for the 
instrumental parameters, whereas source and atmospheric parameters depend 
on the chosen observational target, i.e. its magnitude and position in the sky, 
as well as observation- dependent ones. 

The second step defines the operational range of the FSU template in terms 
of OPD scan and its conjugated quantity, the wavenumber range. The range 
and sampling of the OPD scan must take into account the different coher- 
ence lengths of the three sub-bands, characterized by different spectral range 
and central wavelength, and the recovery of the FSU phase and transmission 
through the Fourier Transform at step 3. 

From data effectively acquired in a calibration run it is possible to retrieve 
the current FSU transmission and phase, as the modulus and the phase of the 
Fourier Transform of measured data in the wavenumber space. These mea- 
sures are directly linked with the complex visibility. The interferogram is the 
real part of the complex visibility, and being polychromatic is the sum over 
all working wavelengths. With a FT of the interferogram, we can separate the 
different components. 

Although in simulation the OPD scan is performed with equally spaced step, 
when working with real data this is no longer true. The metrology system pro- 
vides the actual value of the instrumental OPD, with an accuracy of order of 
nanometers, but the different step size have to be taken into account. This is 
done using the Discrete Fourier Transform instead of the FFT over the desired 
wavelength range. 

It must be considered that these values implicitly depend on all the parameters 
we listed before. Laboratory tests can isolate the FSU phase and transmission 
values, then they must be scaled with terms depending by external factors, 
such as the phase and transmission of the VLTI at the input of the FSU, the 
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environmental conditions such as background noise, and so on. 

The estimation of the magnitude of the emitting source is a difficult task, be- 
cause it is corrupted by several other terms, such as visibility, noise sources, 
atmospheric conditions and so on. The effective magnitude, hovewer, can be 
evaluated with a comparison with a set of sources at different magnitudes, but 
all in the same operational conditions, and searching the magnitude that best 
fit the measured one. 

The FSU signal at step 5 is constructed starting from the flux emitted by an 
ideal source, modeled as a blackbody at the given temperature, scaled for the 
source magnitude and the instrumental parameters. The interferometric out- 
put is designed as in equation 13.31 with the phase contribution of the FSU, the 
VLTI and the known offset between A, B, C and D channels, and the estimated 
contribution of the air path and refractive indexes. Finally, noise sources are 
introduced. The thermal background noise is modeled as a blackbody source, 
too, at a temperature of 300 K, see flgure 13.31 Indeed, detector and photonic 
effects are modelled as random uncorrelated variables. 

Simulation parameters 

We now list the relevant parameters used in the simulations, with their nominal 
values: 

• source effective temperature: 3500K 

• flux at zero magnitude: 3.78e7 ph/msec, effective magnitude: 8 

• The refractive index of air is evaluated following Daigne & Lestrade ([2Z] 
and references herein) as: 

n{a) = 1 + a + + 70-^ 

where a = 199.329e — 6, (5 = 1.129e — 6 and 7 = 9e — 9 are air index 
parameters, and their values have been measured at VLTI 

• atmospheric transmission: modelled as in flgure 13.121 

• unbalanced air path: m 
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wavelenglhs [urn] 



Figure 3.12: Atmospheric transmission as a fmiction of the wavelengths. 



• phase and transmission of the VLTI: nominal values (0 and 1, respec- 
tively, for all wavelengths) 



• average visibility of the VLTI: 0.75 



• phase and transmission of the FSU: the nominal values are reported in 
table [S3] 



A 


1.90 


2.00 


2.10 


2.20 


2.30 


2.40 


2.50 


2.60 


FSU tr A-C 


48.8 


48.8 


51.3 


50.2 


45.5 


42.9 


33.8 


33.8 


FSU tr B-D 


45.4 


45.4 


47.4 


47.6 


43.9 


40.6 


34.0 


34.0 


FSU ph A 


2.35 


2.35 


-5.63 


2.62 


0.23 


7.69 


-2.98 


-2.98 


FSU ph B 


89.26 


89.26 


89.52 


89.80 


90.09 


90.40 


90.72 


90.72 


FSU ph C 


175.35 


175.35 


184.17 


176.82 


180.15 


173.69 


185.43 


185.43 


FSU ph D 


269.26 


269.26 


269.52 


269.80 


270.09 


270.40 


270.72 


270.72 



Table 3.5: Nominal transmission and phase values for the four channels A, 
B, C and D of the FSU. Channels A and C, and B and D share the same 
transmission, respectively, because they are separated before the detection. 
The nominal phase is around the phase angle 0, 7r/2, tt and 37r/2, respectively. 



• visibility of the FSU: function of the wavelengths. Its nominal values are 
reported in table 13.61 
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A 1.90 2.00 2.10 2.20 2.30 2.40 2.50 2.60 

FSU visibility 0.934 0.938 0.942 0.946 0.950 0.953 0.955 0.955 

Table 3.6: Nominal instrumental visibility, near 1. 

• thermal background at the input of the FSU: modelled as a black body 
at temperature: 300K (27C) and with an amplitude of 1.22e5 ph/msec 

• detector quantum efficiency: a decreasing function of the wavelength A, 
tabulated in table 13. 7[ 

~X 1.90 2.00 2.10 2.20 2.30 2.40 2.50 2.60 
QE 0.65 0.65 0.62 0.62 0.61 0.58 0.45 0.03 

Table 3.7: Detector quantum efficiency spectral distribution. 



• wavelength array for K band: [1.9 — 2.6]/im, with sampling at 0.05/im 

• integration time: 1 msec 

• OPD range: [—50, 50]/im, with a step of 0.2/im, for a total of 501 points 
Results of simulation 

We simulate a noisy signal, adding a perturbation on the interferogram inten- 
sity, due to photonic noise and to detector read-out noise, with mean fip^ = 20 
ph/msec and variance cx^ = 20^. Both noise sources are modelled as normal 
random variables, thanks to the approximation of the Poisson distribution with 
the normal one at these flux levels. Figure 13.131 shows the results, while table 
13.81 displays the effective wavelengths, evaluated with a weighted mean of the 
transmission functions. The differences between phase channels are small, but 
not zero. 

3.3.2 Calibration of laboratory data 

During the development of PRIMA FSU at the Alenia-Alcatel, now Thales, 
laboratories, several tests were performed with laser or white light input sources 
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phase A phase B phase C phase D 



Kl 
K2 
K3 



2.099 2.096 2.100 2.097 
2.279 2.281 2.283 2.277 
2.440 2.442 2.452 2.450 



Table 3.8: Effective wavelengtfis for the tliree sub-bands and the three 
phase channels, when there is noise on the interferogram. 

at different temperature and so with different spectra. The cahbration pro- 
cedure has been tested with these data sets. We choose a data set as repre- 
sentative of the process. We recorded it on December 2005, the 14th. Data 
are organized as a text file with a matrix containing the recorded values for 
the twelve channels (A, B, C and D phase for Kl, K2 and K3 bands) over an 
OPD scan of about 80 /im, the record time and the OPD position sent by the 
software. 

The OPD is sampled over the range [—29.878, 49.896] /im for a total of A = 267 
points. The mean step is 0.2999 /xm, with a variance of 5.12 ■ 10~^ fim. Fig. 
13.141 shows the outputs of the channels. Each figure contains four signals, cor- 
responding to the four phases A, B, C and D for a single band. The phase 
offset between the signals is nominally of 90 degrees; the zoom of K2 signals 
shows a quite good phase opposition between corresponding outputs (A - C 
and B D). 

The laboratory temperature of the source is estimated in 800C, the integration 
time is set to 1 second. The flux at the reference magnitude is 3.78 ■ 10^ 
photons/sec. 

Evaluation of the overall phase and transmission functions 

The transmission and phase of the instrument are evaluated through the Dis- 
crete Fourier Transform. The wavelength range is [1.9, 2.6] /im, with a step of 
0.01 /im. The differential phase is of interest, i.e. the relative offset between 
channels. In the simulation, phase A is chosen as reference one. The analysis 
of the phase results, plotted in figure 13.151 shows immediately that phase D 
and phase B are exchanged. This fact was due to a wrong pixel indexing dur- 
ing the reading of the detector, and it was corrected. 



After the right indexing of the phases, the transmission and the phase curves 
have the pattern shown in figure 13.161 to be compared with the nominal ex- 



VLTI PRIMA FSU 



83 



ample of figure I3.13[ It can be noted that the flux is not equally separated 
between different sub-bands, but there is a redistribution of it. The pattern 
of the transmission function is also different from the nominal one, especially 
for the side bands, that are more sensible to small misalignment of the spot 
on the detector. This has a weight on the effective wavelengths of the chan- 
nels, reported in table 13.91 There are some differences depending on the phase 
channel, and in particular we can notice that in general band K3 has a lower 
wavelength than the nominal situation. This can mean that part of the flux 
of K2 has migrated into K3 pixels, or that flux in K3 was lower than expected 
from the instrument transmission, lowering the weight of longer wavelengths. 
The phase functions shows a stability over the phase channels, even if there 
are some discrepancies between the nominal values (0, 90, 180, 270 degrees). 





phase A 


phase B 


phase C 


phase I 


Kl 


2.099 


2.088 


2.101 


2.128 


K2 


2.279 


2.240 


2.296 


2.322 


K3 


2.412 


2.402 


2.426 


2.422 



Table 3.9: Effective wavelengths for the three sub-bands and the three 
phase channels. 



Evaluation of the visibility 

The evaluation of the overall visibility can be performed in several ways. Being 
in the quadrature case, so with the A, B, C and D outputs, we can use the 
standard ABCD method, seen in chapter [21 but here the lack of normalization 
of the beams can not be resolved through photometry, and it causes the vis- 
ibility to be corrupted by the envelope shape. The ABCD visibilities in one 
side subband (Kl) and in the central K2 are reported in figure 131171 One way 
is to search the maximum of the visibility function. 

Another way is to evaluate the modulo of the complex visibility function, and 
from theory we know that it is equal to the Michelson visibility [Ij: 

VImax Imin /q r)^^ 

M = J —J [6.ZU) 

where I is the intensity in a channel. With this approach, we find a visibility 
value for each sub-band and each phase channel, and we can interpolate them 
over the wavelength range. The values found for each channel are reported in 
table 
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phase A phase B phase C phase D 



Kl 
K2 
K3 



0.891 0.789 0.920 0.895 
0.884 0.916 0.829 0.879 
0.794 0.840 0.740 0.640 



Table 3.10: Michelson visibilities from calibration data 



Evaluation of the magnitude 

With the values of the transmission function, of the phase (section I3.3.2p and 
of the overall visibility (section [3.3.21) . it is possible to evaluate the magnitude 
of the source. The followed method foresees the generation of a set of tem- 
plate with different magnitudes, but with the evaluated values of transmission, 
phase, visibility, and leaving all other parameters unchanged. The flux level 
of the real data is then compared with the template ones, and the estimated 
magnitude is interpolated from the flux curve (see figure 137181) . The value is 
9.11. 

Signals reconstruction 

With the information collected before, it is possible to reconstruct the signals, 
following equation 13. 1[ The monochromatic interferograms are computed, and 
then added together to form the final polychromatic signal. Figure 13. 191 shows 
both the measured (blue) and the reconstructed signals. It must be noted 
the perfect similarity of fringe separation, while the flux level needs further 
adjustments. This is true especially for band Kl and K3 (see figure 13.201 for 
a zoom), for which the flux is weaker, while in K2 the reconstruction is very 
faithful. 

3.3.3 Conclusions 

In this chapter, we have analyzed a new approach to the problem of estimat- 
ing the current position of the fringe packet with respect to the optical path 
difference scan, based on an accurate modeling of the interferometric signal. 
We have found good simulated results, but we have also highlighted the need 
of a precise calibration procedure. We could reproduce very well the spectral 
behaviour of the measured signal, and this is an important goal. However, 
we pointed out that there are still some uncertainty on the evaluation of the 
signal intensities magnitude. 
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Figure 3.13: Transmission (four top) and phase (four bottom) evaluated 
by means of the calibration. 
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Figure 3.14: First row, raw signals in lateral bands Kl (left) and K3 (right); 
second row, raw signals in K2, the wider band, and a zoom of the area near 
to the maximum of intensity. Note that there is a good phase opposition 
between A and C, and B and D, and that the maximum intensity is not at 
the zero OPD, meaning that there is an offset of about lOfim between arms 
of both combiners (A and C and B and D) 
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Figure 3.15: Instrumental differential phase of the four phase channels A, 
B, C and D; the pink line is the nominal value. The blue line is the evaluated 
phase for Kl, the green one for K2 and the red one for K3. 
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Figure 3.16: At the top, corrected instrumental differential phase of the 
four phase channels A, B, C and D; the pink line is the nominal value. The 
blue line is the evaluated phase for Kl, the green one for K2 and the red one 
for K3; at the bottom, transmission functions. Each picture groups together 
the transmission curves of each phase channel. 
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Figure 3.17: ABCD visibility curves for Kl (left) and K2 (right). It is 
evident the effect of the envelope and of the modulation. 
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Figure 3.18: Total flux in K band and the four phases for different mag- 
nitudes. The current magnitude is evaluated interpolating the total sum of 
the data flux over this curve. 
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Figure 3.19: From top to bottom, measured (blue) and reconstructed (red) 
interferogram for Kl, K2 and K3. 
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Figure 3.20: Zooms of Kl (first row) and K3 (last row) signals. 
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Chapter 4 



Statistical analysis of 
interferometric data 



In this chapter, we will analyze typical interferometric data with classical sta- 
tistical techniques, both in the time and in the frequency domain. Then we 
will select some other statistical instruments, such as tests on variance and 
multiple regression analysis, to properly identify features of interferometric 
signals, in particular the variability. 



4.1 Introduction 



In the first part of this thesis, we have described the physical and mathematical 
framework of interferometry, and used different models in the algorithms for 
estimating the fringe location with respect to the optical path. We have used 
the ideal model (see chap. [H par. 11.3.21) : 



lip, A) 



(/i + h 



h + h 



.271 , 

sin[ —p) 



A 



Hp) 



A2 



Hp, A) 



np 

1 + sine-— cos 



27ip 

"a7 



(4.1) 



where I{p, A) and I{p) are the monochromatic and the polychromatic inter- 
ferogram, respectively, II and 12 are the intensities of the signal from each 
telescope, p is the differential optical path (OPD) between the two incom- 
ing beams, Au is the spatial frequency range, linked to the wavelength range 
[Ai, A2], rjo is the constant instrument response, and Lq is the coherence length. 
We have seen that, when working with real data, in real context, this model 
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is not sufficient (chap. [2D- Many sources of noise add to the real signal, such 
as atmospherical turbulence, that we have described from theoretical model 
(see chap. [U, par. 11.4.21) . Their residuals fluctuations after the adaptive optics 
correction, plus instrumental noises, can be described in their simpler form 
by gaussian processes. Also instrumental characteristics, such as transmission 
and phase, can be properly described as spectral distributions, as they are not 
constant over the wavelength range. 

To include such contributions in the signal model, it is necessary to add new 
variables to the model: 



I{p) ~ pi 



1 + if ■ rf ■ T]^ ■ Vsin + — [ra-j9+(n — uq) ■ pA\ ] 



(4.2) 



where / the intensity of the incoming signal, p is the transmission factor, Vj^ 
the instrumental visibility, associated to the photometric unbalance of chan- 
nels, t]^ the source spectral distribution, 77^ the wavefront degradation, V the 
source visibility, (p the instrumental phase, n = n{X) the air refraction index 
and Pa the optical path in air. 



All these parameters have to be measured, and their fluctuations have a con- 
sequence on the system stability, as we said in chap. [3l Even if this model can 
give good results, there are still some discrepancies between measured and re- 
constructed signals, or degeneracy among parameters increasing the difficulty 
of correct estimates. We can expect some residual fluctuations after the wave- 
front correction done by adaptive optics, or higher order interactions between 
the two beams. There are still some efforts to do to describe correctly the in- 
coming beams, and to understand if some features (flux intensities variations, 
or spectral characteristics, and so on) are systematic or random, in order to 
properly model them in 14. 2[ 

Is it possible to deflne a more manageable equivalent signal model, together 
with a set of diagnostics and estimate algorithms? And if it is the case, is it 
possible to use them for several different data, in order to compare the results? 

If we read the values given by a detector after each integration time as a time 
series, the ideal approach would be to have a mathematical model containing 
the signal static features, as eqs. 14.11 and 14.21 the noise sources suggested by 
experimental and theoretical evidence, their correlations and temporal evolu- 
tion: in a word, a stochastic equation. The difficulties, however, are great. 
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In literature it is possible to find statistical distributions for photometric sig- 
nals in different conditions (see, for example, the treatment of Goodman 
ch. 4]). But when the beams get combined, it is not clear which is the most 
convenient description of what happens inside the instrument. Even in the 
case we could assign a distribution to some variables (for example, noise), it is 
not easy to identify their evolution in the final interferogram, because we just 
have a deterministic description of the physical phenomenon, and we have no 
simple way of propagating the random process distributions. 

An alternative solution is to use statistical methods. First of all, we have to 
restrict the field of research, i.e. the questions that could benefit from an anal- 
ysis of this type, then to properly model the physics involved and to find a 
related sufficient amount of data to have statistical significance, and finally to 
select the appropriate statistical tools. 

In this work, we will focus on the analysis of astronomical beams before and 
after the combination, in order to determine features maintained or changed 
in the interference. For this, we have selected data coming from the VLTI 
commissioning instrument, VINCI, for several reasons, such as the fact that 
both photometric and interferometric outputs are available, for large data sets. 
The description of VINCI and of its data is the subject of par. 14.21 We use 
the classical instruments of the statistical analysis, both in the time (par. 14.30 
and in the frequency (par. 14.41) domain. 

We then focus on the variability of all these signals, and we use more sophisti- 
cated tools to follow the time evolution of this variability (par. 14.61) . We also 
try to identify the contribution of the combination system to the variability of 
the output beams. For this scope, we retain the simpler model of eq. 14. ![ and 
we use the regression analysis and its tests (par. 14.81) . 

Finally, we list in par. l4.1Ul some questions that arose throughout this analysis, 
and that could also benefit from statistical analysis. 

We have encountered several difficulties while applying the described tools to 
our data. First of all, sampled data values are integers, and this poses some 
problems while using tests based on normal distribution, which is continuous. 
Then, features of the signals, such as time-varying trends, required some care 
even in applying estimators of functions, like covariances and correlations, that 
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are well known and deeply studied in literature. Finally, stationarity of time 
series is an important property to validate the results of statistical and prob- 
abilistic approach, but we gathered evidence that we are handling time series 
that are not stationary, not even weakly. 



4.2 Instrument and data description 

The analysis has been performed using sets of measurements acquired by the 
VLTI VINCI (VLT INterferometer Commissioning Instrument) instrument, 
working in K band ([2.0 — 2.5] fim). This choice was justified by the char- 
acteristic of VINCI and of its data. The principal request on the data is to 
have both photometric and interferometric signals, recorded synchronously, in 
order to be able to link photometric level at a given time to the corresponding 
interferometric signal. 

This is possible also with other instruments, such as FINITO in combination 
with scientific instruments like AMBER, that are now working with real data 
(see par. 11.4. ll of chapterU]). The VINCI data were preferable for the availabil- 
ity of a large set of homogeneous data, collected in a comparably long period, 
since the instrument was used since the beginning of this century. 
The amount of data allows a statistical analysis with some confidence in the 
validity of the results. 

A detailed description of the VINCI instrument can be found in literature [I5] . 
Here we give a summary of its principal characteristics. The stellar beams col- 
lected by two telescopes are set in nominal phase by the delay lines, then they 
enter the instrument and are injected into optical fibres, that bring them into 
the core box, called MONA. The beams first enter two beam separators that 
send half of each beam directly to the detector for photometry, while the other 
half are sent to a common coupler, where they can interfere thanks to the 
electric fields superposition within the coupler. The OPD scan is modulated 
by a mirror mounted on a piezoelectric translator. The coupler provides two 
complementary outputs, containing the interferometric modulation and that 
are sent to the detector. Only the four illuminated pixels on the detector are 
read, to increase the readout rate. 

While the OPD is modulated on a complete scan of the coherence length, the 
detector is read at a frequency up to few kHz. This procedure allows to have 
time series of modulated interferometric pattern together with the correspon- 
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dent photometric time series. 

We have to highlight a peculiarity: the sampling of the photometry and the 
interferometric outputs is synchronous, but the photometry is just one half of 
the incoming beam, and it is not the half that contributes to the interferomet- 
ric output. In the following analysis, we will assume that the two halfs (the one 
sent to the detector and the one sent to the interferometric coupler) are equal 
and are subject to the same amount of noise. It is a reasonable hypothesis, 
because the beams are traveling into optical fibres, and not in air. 

Each OPD scan provides a record of data, read from the detector, composed by 
four time series, two for the photometry and two for the interferometry. The 
flux intensities are given in ADU (Analog Digital Unit) and they are integers. 
The scans are repeated, and a number of records is stored. 
Each observation provides four different sets of data. The first three sets are 
for calibration purposes. The fourth contains the results of the interferometric 
observation. We will hereafter refer to these different sets as case 1 to 4: 

case 1. Off source. The two arms of the beam combiner are opened, but not fed 
by source flux. It is a record of the noise level and noise propagation 
inside the instrument. 

case 2. One arm of the combiner (arm A) is fed with stellar source, while the 
other is closed; it still contains background noise. It is useful to check 
feature of the single arm. 

case 3. The same as case 2, but specular with arm B. 

case 4. On source: both arms are fed with stellar source, so the interferometric 
combination is possible. 

Each case provides the recording of the four pixels (two for the photometry 
and two for the interferometry), for a number of complete OPD scans (100 for 
case from 1 to 3, and 500 for case 4). 

The data set analyzed is based on an observation done on July 15, 2002. The 
reference target was O Centauri, while the scientiflc star was a Centauri A. 
Both stars are bright, but G Centauri is smaller than a Centauri A, so its 
visibility is higher. The data set, used to describe the VINCI data processing 
[T2j, is part of a series of observations used for the determination of the angular 
diameter of a Centauri A [3l] . We choose the reference star, and we retrieved 
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the data from the ESO archive (http:/ / archive. eso.org). 

The integration time was set at 0.364 seconds, at a piezo frequency of 650 
fim/sec, which gives a scan of 236.6 fim. Each scan contains 526 points, so the 
step is 0.45 fim. The observation is done in the K spectral band ([2.0—2.5] yum), 
this means that each fringe contains ~ 5 points. 

Data are counts of photons occurrences (ADU). This means that, as said be- 
fore, they are integer numbers. 

Picture 14.11 shows two typical records of VINCI data. It is possible to recog- 
nize the two photometric inputs, in pink and green, and the two interferometric 
outputs, in blue and red. Hereafter, we will refer at them as PA, PB for the 
photometric signals and II and 12 for the interferometric ones. 



Raw data, record number 5 Raw data, record number 16 




opd [[im] opd [nm] 



Figure 4.1: Raw data, record 5 and 16 



Data immediately show a specific feature, i.e. the presence of a trend chang- 
ing over time. This trend makes clear the correlation between photometric 
and interferometric channels, but it can cause features on the autocorrelation 
function and on the spectrum. 

4.3 Statistical analysis in the time domain 

Even if we could have some theoretical information about the behaviour of 
photometric signals in very controlled conditions |i23i, ch. 4], we still need tech- 
niques of statistical inference to properly characterize the time series we are 
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dealing with. We have to face the problem of estimating unknown quantities, 
such as correlation functions, directly from the data. In the next paragraphs, 
we will adapt classic statistical instruments to the particular features of VLTI 
signals. 



4.3.1 Statistical methods description 

Since we do not have any a priori information on the relevant data features, 
such as mean values or variances, our work is based on the data analysis. The 
'unit' samples set is the record, i.e. 526 points corresponding to a single OPD 
scan. 

The presence of a non negligible trend varying over time imposes some consid- 
erations on the sample mean definition. It is useless to consider a global mean 
over the record, because it can be very different from 'local' mean. Instead of 
subtracting a mean, we subtract the linear trend, evaluated over sub-intervals 
on the record. This approach permits to obtain a zero mean signal. The sub- 
tracted signal is an essential feature of the beams; being a time-variable trend, 
it can not be considered a seasonal trend, but a characteristic of the time 
series. The effect of the subtraction of these two different means (the global 
and the local one) on the correlation between beams are analyzed in the next 
paragraphs. Moreover, the evaluation of the mean over different subintervals 
changes the number of degrees of freedom in the estimators. 
The detrend operation is done through the Matlab detrend operation (see 
help page at http://www.mathworks.com/support/functions/alpha_list.html). 
A continuous, piecewise linear trend is subtracted, using set of user-defined 
breakpoints. The coefficients of the piecewise polynomial are computed with 
a least squares fit. 

With these considerations in mind, we estimate autocovariances and autocor- 
relations of single detrended signals and covariances and correlations between 
channels using as estimators the sample version of these functions, i.e. the 
autocorrelogram and the correlogram, respectively. These functions are esti- 
mated over each record of interest, and then averaged over all records. 

Following page 321], we define the sample covariance 712(0 and the sample 
correlation pu^l) between the signals si{t) and S2(t), where t is a discrete 
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variable, as: 

N-\l\ 

/,,,N -L 

7l2 



ZlTl (^i(^) - + t) - < |/| < (iV - 1) (4.3) 



i=l 



Pi2m) = ^^^==, 0<|/|<(iV-l) (4.4) 

and the sample autocovariance 7(/) and autocorrelation p{l) of the signal s{t) 
as: 

N-\l\ 



N-\l\ 

= iv^m 5Z - ^)(^(^ + i) - 5), < |/| < (iV - 1) (4.5) 

' ' i=l 

P(l^l) = ^, 0<|/|<(iV-l) (4.6) 

where s is the sample mean and o"^ is the sample variance. If not differently 
specified, the chosen mean will be the linear trend. 

For the properties of these estimators, we refer to the section lA.ll of appendix 
lAl and to and references herein. Here we recall that these functions are 
unbiased estimates of the true functions if the mean is the expectation, while if 
the mean is estimated from data, these estimators are asymptotically unbiased. 

To take into account the subtraction of different mean values over different 
subintervals, we propose to change the coefficient j:^h\r\ to ^^qip^, where k is 
the number of subintervals used to evaluate the changing mean, meaning that 
the degrees of freedom of this estimation are reduced by the multiple evalua- 
tion of the local mean. Unfortunately this correction changes the properties of 
the previous estimators, that becomes biased even if the mean coincides with 
the expectation. However, it is asymptotically unbiased. The proof of this 
result is given in appendix lAl par. lA.li 

Hence, we choose the following estimators, proposed by e.g. Parzen (see 
for references): 

N-\l\ 

m) = ^ E(^i(*)-^"i)(^2(^+t)-i2), o<|/|<(iv-i) 
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It is biased even if the mean is not estimated from the data, but in general 
has a smaller mean square error than the previous one. In particular, in our 
case we want to smooth the lobe effect of the 1/(A^ — |/| — k) factors for great 
values of caused by the average over a decreasing number of factors. 

4.3.2 Void channels 

The analysis of void channels, i.e. channels in which the stellar beams are not 
injected, is useful to identify the environmental working condition. The pres- 
ence of some noise can be expected, due to the laboratory thermal background 
and scattering. Figure 14.21 shows a record of the four outputs with a zoom. 




OPD Itim] OPD Itim] 



Figure 4.2: The four void channels representation with a zoom (right); the 
channels are vertically shifted for better understanding 

The analysis of the autocorrelograms reveals that noise channels are self- 
uncorrelated (see fig. 14.31 where the PA and the II channels are taken as 
representative), while the cross- correlograms of fig. 14.41 show that they are 
also cross-uncorrelated, as we could expect and hope. 

It is useful to verify the statistical properties of these signals, with our previous 
considerations in mind. First of all, we take a look at the histograms (fig. 
14.51) . The number of classes is limited by the finite range of possible values 
taken by data. The red line is the probability density function of a normal 
distribution with same mean and variance values as the data. The Lilliefors 
test for normality is evaluated; its p-value is < 0.01, so there is statistical 
evidence of normality distribution of data. 

We recall that we are working with integer values, hence the test could not be 
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Figure 4.3: Case 1: Autocorrelation function for photometric PA (left) 
and interferometric II (right) channels when no flux is injected in 



reliable, since it is based on continuous distribution. However the sample size 
set is huge (52600 samples - 100 records of 526 samples each). This allows a 
rescaling of the data that determines a continuous limit. The results of the 
test seems then to confirm the normality hypothesis. 



4.3.3 Input: photometric signals 

We now investigate the sample autocorrelation and sample cross-correlation 
of photometric channels, following eq. 14.71 when flux from stellar source is 
injected in both arms of the combiner. We recall that photometric flux cor- 
responds to half the intensity of the input beam fed to the instrument. The 
mean evaluation problem now arises. We flrst evaluate the mean, used in the 
estimation of the autocorrelogram function of the photometry channels, as 
a global value over each record. Then, all the autocorrelation functions are 
averaged over the records. In flg. 14.61 the results are shown. 
The autocorrelograms tend to zero very slowly. This is a consequence of the 
presence of the linear trend, that is not affected by the offset elimination. A 
linear trend is a strong correlation between consequent time samples. 
We then perform a detrend operation, using the detrend function of Matlab 
described in par. 14.3. 1[ We compute again the autocorrelogram functions, 
and we compare two different situations: for the former, the detrend operation 
is performed over flfty-sample subintervals, for the latter, over ten-samples 
subintervals. The results are shown in figure 14.71 for the photometric channel 
A; the photometric channel B is similar, and can be found in appendix [Bl 
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Interferomelric channels 1 & 2 
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Photometric channel A ar 



Photometric channel A ar 



Figure 4.4: Case 1: First row: cross-correlogram functions for photometric 
PA and PB (left) and for interferometric II and 12 (right); second row: 
cross-correlogram between inputs and outputs - PA and II (left) and PA 
and 12 (right) 



The correlation between distant lags (/ > |20|) drops to zero, but there is a 
residual correlation for smaller lags that cannot be explained by the trend. 



We now investigate the possible cross-correlation between the photometric 
channels. We know that they come from a common stellar source, that their 
paths from the collection at the telescope till the detection in the laboratory 
are similar, but they can be subject to different sources of noise with different 
amplitudes. 

Again, we find a difference if signals are detrended or not, as fig. 14.81 shows. 
However, the differences are smaller than for the auto-correlations. 



We investigate on the cross-correlation estimation for the input channels in 
case 2 and 3. We remember that in these cases one channel is fed, while the 
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Istogramma (VIMCI 2002-07-1 5_ncine_data.sta in registi'o_none stw 6v'52600c) 
PA = 52600*1 "normalCx, 5,2716, 1 ,9909) 




I PA: N = 52600, Media = 5,27159696, DvStd = 1,99086282, Max = 13, Min = -4; |10 
D = 0,0994621 095, p =: 0,01 00, Lillielors-p < 0,00999999978 



istogramma (VIMCi 2002-07-1 5_ncine_data.sta in registi'o_none stw 6v'52600c) 
II = 52600"! •normalCx, 7,2551, 1,9694) 



1: Ivl = 52600, Media = 7,25509506, DvStd = 1 , 
D = 0,103166575, p =: 0,0100, Lillielors-p < 0, 



3139,Max = 15,Min = -1 



istogramma (VIMCi 2002-07-1 5_ncine_data.sta in registro 
PB = 52600*1 "normaiCx, 4,7165, 1 ,79 



PB' M = 52600, Media = 4,71 646388, DvStd = 1 ,79862948, Max = 1 2, Min = -3; 
D = 0,1104751 61 , p =: 0,01 00, Lillielors-p < 0,00999999978 



istogramma (VIMCi 2002-07-1 5_none_data.sta in registi'o_none stw 6v'52600c) 
i2 = 52600"! •normal(x, 5,1303, 1,9312) 



2: Iv] = 52600, Media = 5,1 3028517, DvStd = 1 ,931 1 5288, Max = 13, Min = -3; 
D = 0,104346697, p =: 0,01 00, Lllllelors-p < 0,00999999978 



Figure 4.5: Case 1: Histograms for photometric PA and PB (first row) 
and for interferometric II and 12 (second row). The red Hne is the density 
of the normal distribution with parameters estimated from the data. See the 
Lilhefors p-level for test results. 



other is void. We have seen in the previous paragraph that void channels 
contains self-uncorrelated 'noise', while in the fed channel the signal is self- 
correlated. We check if there is a correlation between these different channels. 
Fig. 14.91 shows the results for case 3 (channel PA void, channel PB with flux), 
with raw data (left) and after a detrend of PB (right). We can see that in 
both cases the cross-correlation drops immediately to zero, as we can expect 
from the features of the 'pure noise' of PA. Case 2 is similar, and it is reported 
in appendix. 



4.3.4 Output in calibration mode 

In calibration mode, the output channels do not contain fringes. We analyze 
their performances, however, to characterize their behaviour. In figure I4.1UI 
the autocorrelation estimates are shown for channel II for case 2 (first row) 
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Figure 4.6: Case 4: Raw data, autocorrelation functions for photometric 
channels 
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Figure 4.7: Case 4: raw data, autocorrelation functions for photometric 
channel A. Left, the linear trend to subtract is evaluated as a piecewise 
polynomial with breakpoints every 50 samples; right, breakpoints are every 
10 samples. 
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Figure 4.8: Case 4: Cross-correlation functions for photometric channel A 
and B. Left, raw data; right, linear trend subtracted. 




Figure 4.9: Case 3: Cross-correlation functions for photometric channel A 
and B. Left, raw data; right, linear trend subtracted. 
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and case 3 (second row). There are no big difference from the photometric 
input case, i.e. raw outputs have a slowly decreasing autocorrelation function, 
while the detrended signals shows no correlation, apart from lags near zero. 
So the trend induces self-correlations on signals. 



Interferog ram 11: auiocorrelog ram Interlerogram 11: auiocorrelog ram 




"O 200 400 600 BOO 1000 1200 ' 200 400 600 BOO 1000 1200 

lags lags 



Figure 4.10: Case 2-3: Autocorrelation functions for interferometric chan- 
nel II for case 2 (first row) and case 3 (second row): raw data (left) and 
detrended (right). 



The estimation of the cross-correlation does not show any unexpected pattern. 
Due to the fact that there are not fringes, the output channels are very similar 
to the photometric inputs. These functions (see fig. 14.111 for case 2) do not 
reveal any particular residual effect due to the combination process that adds 
up the inputs and then split the sum into the output beams. Case 3 is in 
appendix [Bl 
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Interferomelric channels 15 2- detrended Intederdmetnc channels 1 & 2 
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Figure 4.11: Case 2: Cross-correlation functions for interferometric channel 
II: raw data (left) and detrended (right). 



4.3.5 Output in observational mode: Interferometric 
signals 

The interferometric channels, without the trend subtraction, show a behaviour 
similar to the photometric ones. They need a careful treatment because they 
contain fringes, i.e. the modulation part which contains scientific information. 
Moreover, being the result of an interference between coherent beams, we 
would like to find this characteristic into the correlation functions. However, 
in these sets of measurements the amplitude of beam variations is comparable 
with the fringe amplitude. The autocorrelation function refiects this feature, 
i.e. we can't recognize the presence of the fringes, that are lost. Fig. 14.121 
first row, shows the autocorrelation function for the output beams II and 12. 
Again, we perform the detrend operation. The subtraction of the linear trend 
does not cancel the fringe patterns, at the contrary, the modulation part is 
evidenced. Now the autocorrelation function shows very peculiar features, as 
shown for the II channel in the second row of fig. I4.12[ The companion 12 is 
very similar. 

We can easily recognize two components, associated to the interferometric 
signal components. The modulation part is the sum of sinusoidal waves at 
different wavelengths, so we can expect the behaviour of a harmonic process, 
while the offset is dominated by the photometric fluctuations, with the presence 
of a long term correlation. 

The same holds for the cross-correlation function, too, as flg. 14.131 shows. 
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Figure 4.12: Case 4. First row: Raw data, autocorrelation functions for 
interferometric channels II (left) and 12 (right). The function shape is simi- 
lar to photometric channels (fig. 14. 6p . Second row: autocorrelation function 
after a detrend for interferometric channel II (left) and a zoom in the central 
lags area (right). 
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4.3.6 Cross-correlation between photometric inputs and 
interferometric outputs 

The estimate of the cross-correlation function between the photometric and the 
interferometric signals reveals that the correlation is caused by the trend, as in 
previous cases. We shows in figure 14.141 as an example, the cross-correlation 
between the interferometric channel II and the photometric one PA before 
and after the trend subtraction to each beam. It has to be noticed that each 
beam is detrended independently. 



I ntarlero metric channels 1 and photometry A I nterfero metric channels 1 and photometry A - detrended 
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Figure 4.14: Case 4. Cross-correlation functions for interferometric II and 
photometric PA channels. Left, raw data; right, linear trend subtracted. 
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4.4 Statistical analysis in the frequency do- 
main 

Spectral analysis can help in understanding the nature of noise on signals under 
study. Of course, we would expect noise sources to be white processes, and 
the interference pattern to be well distinguishable from noise. 
In particular, we investigate the possibility that the trend might mask other 
higher frequencies features. 

4.4.1 Spectral methods description 

The analysis in the spectral domain is performed using both the power spectral 
density function and the Allan variance. There is a relation between these 
quantities, well established in the case of wide-sense stationary time series. 
For references, see |36j . 

Power Spectral Density 

The Power Spectral Density (hereafter, PSD) can be defined in several ways. 
The need of the PSD function estimation instead of the energy spectral density 
arises when signals are such that their total energy is not finite. For a detailed 
discussion on this topic, see, e.g., or [37] . 

We estimate the PSD with the periodogram ^{k), < k < N — 1, i.e. the 
squared modulus of the discrete Fourier transform of the signal s time series 
{sj}j=o...Af-i) with the appropriate normalization: 

1 

j=0 

^{k) = 0<k<{N-l) (4.8) 

where the relation of k with the frequency effectively recoverable from the 
data can be found with the information on the data scan of Sec. 14.21 From the 
literature [371 Pg- 209], we know that this estimation of the PSD is affected 
by bias problems, and it is not consistent. 

To reduce the measurement noise variance, we apply a smoothing operation on 
$(fc) averaging over a five-samples window, moving the window one sample at 
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a time. Moreover, we evaluate the periodogram over all available records for 
each case, say M, and we average all periodograms to get a final estimation of 
the PSD of the time series. This procedure is equivalent to a welch smoothing 
without overlapping of the window, and it leads to a decrease in the standard 
deviation of the estimation as -kj. 

The spectral bias problem can arise from a sharp truncation of the sequence, 
and can be reduced by first multiplying the finite sequence by a window func- 
tion which truncates the sequence gracefully rather than abruptly. In our 
calculation, we chose the Hamming window, defined as: 

k 

w{k + l) = 0.54-0.46- cos{27i— — -), k = 0,...,N -1 (4.9) 

If n is odd, this window is symmetric around the median point; if n is even, 
it does not have a central point. Figure H.15I shows the Hamming window for 
N = 500 points. 



0.8 



Figure 4.15: Hamming window with N = 500. 



The presence of a non zero mean has infiuences especially at low frequencies. 
Even if the mean was constant over all records, the leakage! phenomenon would 
spread the frequency peak of the mean on the neightbour frequencies, possibly 
obscuring low frequencies components of the spectrum, if present. 

^Leakage: contribution of the sinusoidal components with frequencies lo ^ ujq to the 
periodogram value ^{loq) 
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Allan variance 

Another technique we have applied for the understanding of the spectral be- 
haviour is the Allan variance, or two samples variance. It was introduced in 
the sixties [36] [38] , in the field of frequency stability measurements for metrol- 
ogy signals, and is now widely used. It was originally conceived to avoid the 
lack of convergence of the usual variance with some orders of power-shaped 
spectra. In astronomic science fields, it was first used in radio astronomy, for 
the phase difference time series [HI pag. 272] , but also for a general time series 
[39] . Some authors, such as Colavita [ID], used it for estimating phase dif- 
ference spectral features in the field of infrared interferometry, with a slightly 
different formulation (Allan modified). 

We use the original Allan variance, evaluated over a set of time lags, r, derived 
from a reference lag tq. If 

A[« = -[x,+.„-x,] (4.10) 

^0 

is the average of the time sequence x{t) over the time interval [t,t + r], the 
Allan variance at lag r = kr^ is defined as[ 



N-2k+l 
^ ' n=l 



" 2THN-2k + l) ^ i^n+2k-2Xn+k + Xny. (4.11) 
^ ' n=l 

Since the summation goes from 1 to N — 2k + 1, the last is the maximum 
number of independent factors. High lag terms are affected by errors due to 
the average over a small number of values. 

The exponent of the variance, as a function of the lag r, can be related directly 
to a range of power-shape spectra thanks to the following relation: 

Sy{f)=a^-r - crl{T) = ~a^-T^ (4.12) 
valid for —2 < a < 2, and where a and (3 are linked by: 

p = -a-l. (4.13) 

In particular, for a flat {a = 0) and a 'flicker' noise a = —1, the Allan variance 
exponents are (3 = —1 and (3 = 0, respectively, with coeflicients a_i = 1/2 ■ 
and fio = 2/r;,(2)a_i. 
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4.4.2 Spectral analysis of raw data 

We perform a spectral analysis both on calibration channels and on channels 
with injected fluxes. The former are useful to assess the features of the envi- 
ronmental noise, features that are covered by flux patterns in the latter case. 
Figure I4.16[ first row, shows the power spectral density for a photometric and 
an interferometric channel, when both arms are not fed with stellar source. 
We process each record according to the described method of section 14.4.11 
with the final PSD average over M = 100 records. 

We apply the same technique when flux is injected in both arms of the interfer- 
ometer, with the difference that the final average is performed over M = 500 
records. Comparing the photometric performances, we can notice that when in 
the interferometer there is just noise, the PSD drops very quickly to the high 
frequencies intensity offset, while the presence of flux induces a smoother de- 
crease of the PSD intensity. The interferometric performances, on the contrary, 
are dominated by the presence of the modulation frequency, clearly identifiable 
(fig. 14.161 second row). 

We can see, in both photometric and interferometric channels, the presence of 
two low-frequency peaks that have highest magnitude than the surrounding 
noise. 

If we apply a detrend operation to the observational data (figure HUT]) , low 
frequencies are suppressed, because of the slow motion of the photometric 
intensities. The leakage phenomenon is reduced or eliminated. Figure 14.171 
illustrate this behaviour for the photometric channel PA and for the interfer- 
ometric one II. 

We compare with the PSD evaluated for case 2 and 3. Figure 14.181 shows 
photometric PA and interferometric II for case 2. We can notice that the 
two low-frequency peaks are no longer present, not even in the photometric 
channel. This fact can be interpreted in two ways: it is due to a cross-effect 
between the two beams when injected into the instrument, and before being 
separated in two parts, or it was merely an observational noise, feature of that 
set of data, and not repeatable. 

Effects of the application of the spectral window 

To understand the role of the spectral window, we explore the behaviour of 
the averaged periodogram without the application of the Hamming window. 
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Photometric channel A - PSD Interferomelnc channel 1 - PSD 




100 200 300 400 500 600 700 800 100 200 300 400 500 600 700 800 



frequency (Hz) frequency (Hz) 

Figure 4.16: PSD estimation for the photometric channel A (left) and for 
the interferometric channel 1 (right). First row: case 1 - no flux injected - 
note the low-frequency components. Second row: case 4 - flux injected - it 
is evident the frequency range of the fringes, well distinguished from noise. 



We notice, comparing the second row of figure 14.161 with figure 14.191 that the 
windowing has the effect of whitening the estimated PSD, sharpening its drop 
toward the fiat offset, and to reduce the offset of the high-frequency white 
noise, as expected. 

It is interesting to notice that the window effect on the PSD estimation depends 
on the frequency, since its effect is greater at low frequencies, where the power 
magnitude is higher. This is probably due to the characteristic of the spectrum 
of the window, since the multiplication of two functions in the time domain 
corresponds to the convolution of their Fourier transforms. 
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Photometry A - detrended - PSD 

9 Intefferometric channel 1 - delrended - PSD 

450 I 1 1 1 1 1 




100 200 300 AOO 500 600 700 800 100 200 300 400 500 600 700 BOO 

Ireqyency (Hz) frequency (Hz) 

Figure 4.17: Case 4. Power Spectral Density estimation for the photomet- 
ric channel A (left) and for the interferometric channel 1 (right). All the 
power area is the frequency range of the fringes. Note the different scale 
of figures, used to highlight the low-magnitude patterns of the PSD in the 
photometric case. 



Effects of the subtraction of an estimated trend on the PSD evalu- 
ation 

We have seen, in the previous paragraphs, how the slow evolution of the flux 
observed on the photometric channels has a strong impact on the features of 
the signals, both in the temporal and in the frequency domain. We have re- 
marked how the estimation of this trend had to be carefully handled, because 
it is changing in time. 

We must underline that what we subtract from the raw signal is just an esti- 
mation of the trend of the data, and we can expect that the features of this 
estimate reflect on all other estimated functions. We have analyzed in details 
this problem when the trend adds up to a wide sense stationary process, i.e. 
a stochastic process whose first and second order moments do not depend on 
time. We have given the error on the expectation of the PSD in the simple case 
of a constant trend subtraction, and an asymptotical value for this expectation 
in a general case (see appendix . 

Here we just mention the result of interest in our situation. Let us suppose that 
the signal can be expressed as the sum of a trend and a residual process. Let 
the residual process be such that it possesses a continuous spectral description 
with density f{uj), where u is the frequency variable. If the trend functional 
form t = g{x) is such that the x variable has suitable properties (i.e. the 
regressors have no upper bounds, they increase slowly, they have a correlation 
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Photometric ctiannel A - PSD 



Ptlotometric ctiannelA-detrended-PSD 
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Figure 4.18: Case 2: Power Spectral Density functions for photometric 
input PA (first row) and interferometric output II (second row): raw data 
(left) and detrended (right). 



matrix which is non singular for the zero lag), the regression estimate of the 
trend coefficient is the best linear one. Moreover, the asymptotical behaviour 
of the bias can be formulated and depends on the window size and on /(cu). 
Of course, it is necessary to have prior information on the residual process. 



4.4.3 Allan variance 

We would like to con&m our spectral results using the Allan variance tool. 
Let us analyze data referred to case 1. We ffist generate a realization of a 
family of random variables, each distributed as standard gaussian r.v. A^(0, 1). 
The family dimension is 526 samples, in analogy with each record analyzed. 
We then compare the Allan variance of this family with each record of the 
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Photometry A - PSD _ Interferometric channel 1 - PSD 




100 200 300 400 500 600 700 800 100 200 300 400 500 600 700 800 



frequency (Hz] frequency (Hz] 

Figure 4.19: Case 4. Power Spectral Density estimation for the photomet- 
ric channel A (left) and for the interferometric channel 1 (right) without the 
Hamming window. 



inputs channel PA and PB and of the output channels II and 12. As we 
could expect from the spectral analysis, we find strong similarities between the 
gaussian family and the astronomical data. Figure 14.201 shows this behaviour 
of the void channels for some records (10) for a photometric channel, PA, and 
an interferometric one, II. Due to the regularity of the records pattern, it 
is useful to consider the mean of all records for the different channels, and 
this averaged Allan variance is shown in fig. 14.211 always compared with the 
reference white noise. 



White noise random process and PhotometryA ^ While noise random process and Intensity 1 




Figure 4.20: Case 1. Allan variance comparison between a realization of a 
gaussian white noise (red) and ten records (blue) of the photometric channel 
PA (left) and the interferometric channel II (right). 



The situation changes when we consider data from case 2 and 3, i.e. when just 
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White noise random process and PholometryA ^ White noise random process and PhotometryB 




10° 10^ 10^ 10° 10° 10^ 10^ 10° 



Figure 4.21: Case 1. Allan variance comparison between a realization of a 
gaussian white noise (red) and ten records (blue) of the photometric channels 
(first row) and the interferometric ones (second row). 



one photometric channel is fed with flux, while the other is left void. Figure 
14.221 shows, for each picture, ten records of a channel compared again with 
the gaussian white noise described before. Since the patterns are not regular 
between records, we do not average. We can however recognize a common 
pattern: the first part of each variance function is very similar to the reference 
white noise, and then the pattern changes. Moreover, the intercept of the 
variance line changes from record to record, and this is caused by a changing 
variance value (just remember that this value is an average value over all 
intervals of a certain dimension). 

We can then conclude that locally the signal acts like a gaussian white noise, 
over moving intervals of about 10 samples, than other features appears. Sam- 
ples separated by a lag r < 10 can be considered uncorrelated, while for higher 
lags (r > 10) the r exponent changes dramatically. We can recognize, for cer- 
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White noise random process and PholometryA White noise random process and PholometryB 




Figure 4.22: Case 2. Allan variance comparison between a realization of a 
gaussian white noise (red) and ten records (blue) of the photometric channels 
(first row) and the interferometric ones (second row) for case 2. Flux is 
injected in channel PA (first row, left), while PB is void (first row, right). 
Interferometric channels are both interested. We remember that there is no 
interference pattern. 
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tain records, a straight line for intermediate lags (10 < r < 40), that is in 
relation with a noise with spectrum oc 1//. Some patterns show a curve that 
could indicate an underlying sinusoidal process, quite strange in photometric 
signals. 

For higher lags (r > 40), there is not a clear indication of a known pattern. 
The final lags are dominated by fiuctuations caused by the variance estimation 
over a poor number of terms. 

Finally, case 4 of observational data is interesting. It is shown in figure 14.231 
For the photometric channels (first row), the same considerations as before 
apply. Moreover, we can say that the photometric channel PB behaves differ- 
ently from PA, its samples seems to be less correlated. This could mean that 
the different travel of each beam before the combination induces perturbations 
that are different not only for the magnitude, but also for their statistical 
properties, such as correlation. 

For the interferometric channels, however, we can notice several things. First 
of all, for some records it is not possible to find a white-noise-like behaviour, 
even at low lags. The presence of the modulation is recognizable thanks to the 
oscillations of the Allan variance. These records are the ones in which fringes 
effectively formed. 

Other records, in which fringes are not present, behave like interferometric 
channels of case 2 and 3. We made the correspondence between variance pat- 
terns and fringes formation records by visually comparing them. 

Optimization of this basic evaluation of the Allan variance, such as the Dy- 
namic Allan Variance [41]. should help in identifying also when different pat- 
terns appears or when they are covered by other effects. 

We can finally conclude that the analysis of the Allan variance for these data 
is useful, because it allows to recognize the scale at which the different types 
of noise appear. 

An extensive study, based on a large amount of data in different working 
conditions, should help to identify these different noise sources, how often 
they appear, and to search their influence on instrument performances, such 
as the OPD/GD estimator algorithms we have seen in the previous chapters. 
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Figure 4.23: Case 4. Allan variance comparison between a realization of a 
gaussian white noise (red) and ten records (blue) of the photometric channels 
(first row) and the interferometric ones (second row). 
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4.5 Conclusions for statistical analysis 

In the previous paragraphs, we have analyzed in details typical interferometric 
data with statistical classical techniques, first in time then in frequency do- 
main. We are now able to resume the most important features of our data. 

First of all, if we consider the calibration data, we can see that the peculiarity 
of signals before combination are maintained even after. 

We can recognize two different components on the signals: a linear 'trend', that 
contains the macroscopic fluctuations, and a residual variability, that we can 
define 'microscopic'. The former is the responsible of the presence of a slowly 
decreasing autocorrelation (and cross-correlation between different channels). 
In fact, if we subtract it from the raw data, all auto/cross-correlations drops 
immediately to zero, apart lag zero. The latter component is an uncorrelated 
process. 

This results are confirmed by the spectral analysis. The power spectral densi- 
ties are affected by low-frequency components that can be linked to the slow- 
moving trend on data. Once this trend is subtracted, the PSDs confirm that 
the residuals are uncorrelated signals, apart from frequencies around the zero, 
which have a pattern that could be a leakage of the zero- frequency components. 
These considerations can appear in contrast with the results of the Allan vari- 
ance, especially for the photometric inputs. In fact, this technique shows how 
photometric signals can be considered locally uncorrelated over ^ 10-samples 
sized intervals. On the contrary, since the Allan variance is based on the differ- 
ence of samples separated by a certain time lag, for sufficiently close samples 
the trend can be considered constant, and it is eliminated by the subtraction. 
If the time lag is larger than the length of 'stationarity' of the trend, it is no 
longer subtracted, and it induces fluctuations on the variance. 

We are able to conclude that the the linear trend can be considered locally 
constant over an interval of 10 samples, which corresponds to 6.9 msec, or 
equivalently to 4.5 /im.. At the working wavelength of VINCI, i.e. ~ 2.0 /im, 
the linear trend is stable over two fringes. 

These considerations imposes some constraints for the use of these data in a 
fringe sensor, such the need of preprocessing raw data to subtract the linear 
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trend before applying location algorithms. 

An Allan variance tool would be very helpful, either on-line or off-line, in the 
diagnostics of operating conditions of interferometric instruments and for data 
calibration. 

4.6 Variance analysis 

In previous Sections we have shown that the signal is not stationary, and it 
is possible to isolate a trend. However, the dctrendcd signal may still not be 
stationary. For this reason, we study here the evolution of the signal variance. 
We are no longer interested in signal mean, that we know to be characterized 
by the trend, so we use detrended data. 
In particular, we are looking for two different tests: 

1. given a selected channel, we search for changes of the variance as a func- 
tion of time 

2. given a specific time record, we want to see if the properties of the vari- 
ance of the input channels are the same of the variance of the output 
channels, in terms of homogeneity 

Given the huge size of the data to be analyzed and its organization in a number 
of signals divided in hundreds of records, the use of a statistical software, 
such as Statistica, has required an effort to manage data, in order to suit 
the software requirements (organization of variables in groups, levels, repeated 
measures and so on). 

4.6.1 Statistical methods for variance analysis 

We study the variance of the signals using statistical tests, in particular we 
test variance patterns synchronously on different channels. 
For the test of variance homogeneity, we use the Levene test, usually contained 
in the ANOVA analysis tool. Given a group of data sets, also called a level, 
the test distinguish between the variability of samples in the sets with the 
variability between different sets, to explain the total variability of all the 
samples. In formula, given k sets of samples, each with A^j samples Zij, i = 
1 . . .k, j = 1 . . . Ni, and marginal mean Zi,, the test compares the variability 
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between the marginal means and the overall mean z.. with the sum of the 
variability in each set. If the variances are equal, the ratio F approximates 1: 



The numerator is the variability between the sets of samples, while the de- 
nominator is the sum of the variability inside each set. The test evaluate the 
ratio F, and give a statistical significativity p. If the p- value p is under a spec- 
ified threshold, the diff'erence between the variances cannot be attributed to 
a random eff'ect, but it is likely to have been generated by a true variability 
with a confidence level (1— threshold), so we should reject the hypothesis of 
homogeneity of variances. 

With our data, the level is composed hy k = 100, 500 set (calibration sets and 
observational sets, respectively), while the number Ni is the same for each i: 
Ni = 526, 376 (the difference for the observational sets is to avoid the presence 
of fringe, that can disturb the variance evaluation). The two tests proposed 
foresee a different data organization, that will be described in the relative para- 
graphs. 

Some authors (Glass and Hopkins |12]) have pointed out that the Levene test 
and its modification (such as Brown-Forsythe) are based on the variance ho- 
mogeneity requirement; a lack of symmetry in the distribution of the deviation 
from the means, for example, can cause a violation of the normality required 
for the F test. They highlighted that it is not clear if these tests are robust 
against a great heterogeneity of variances and sets with a different dimension. 
In our tests, however, the significant number of samples in each set and the fact 
that sets are equally dimensioned should prevent us from misinterpretation of 
the results of Levene test. 

In our tests, we choose a confidence level of 95%. 

4.6.2 Analysis of homogeneity of variance 
Levene test 1 



F 




(4.14) 




.k 



For this test, given a selected channel, we want to know if the variance at a 
time ti is equal to the variance at time ^2- So we have the following null and 
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alternative hypothesis: 

Ho : ai(ti)) = (Ti(t2), 1 < ti,t2 < 100(500) 
Hi : 3h,t2 s.t. ai(ti)) ^(Ti(t2) (4.15) 

The dependent variable is the channel, while the group variable able to dis- 
tinguish between different sets is the record number, ranging in [1, 100] for 
calibration sets and [1, 500] for observational ones. The record variable creates 
100 or 500 groups at the same level. In each group, the variance is evaluated 
and compared with the variance of all other groups. We have just one result 
in each channel, saying if the variance is changing. The p-value explains the 
significance of the result: since we have chosen a confidence level of 95%, if 
the p-value is < 0.05, we can't accept the hypothesis of variances equality. 

Case 1. If the two arms are not fed with flux, both photometric and combined 
channels are stable with respect to the variance, since all p-values are 
above the critical value 0.05, as table HJl shows. 



Channel 


MSEffect 


MS Error 


F 


P 


11 


1,372893 


1,267551 


1,083107 


0,269069 


12 


1,332706 


1,221557 


1,090990 


0,252061 


PA 


1,394501 


1,292858 


1,078618 


0,279041 


PB 


1,222665 


1,059943 


1,153519 


0,141204 



Table 4.1: Levene test for Homogeneity of Variances - case 1, without flux 

Case 2 and 3. If just one interferometer's arm is fed with flux from a source, we find 
that the only channel that maintains the variance homogeneity property 
is the void channel (p-value > 0.05). Table shows this result for the 
case 2 (channel A fed, channel B empty), while table 113] is similar, but 
for case 3. 



Channel 


MSEffect 


MS Error 


F 


P 


11 


123,1234 


3,399613 


36,21689 


0,000000 


12 


306,9551 


5,208085 


58,93818 


0,000000 


PA 


197,0255 


4,244176 


46,42255 


0,000000 


PB 


0,9752 


1,067226 


0,91379 


0,717938 



Table 4.2: Levene test for Homogeneity of Variances - case 2, channel A 
fed, channel B without flux 
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Channel 


iViOil/IieCt 


MS Error 


r 


P 


T1 


QD 1 Q9Qn 


1 89988 

O, XOZiZiOO 


98 '?/191 ^ 


n nnnnnn 


12 


46,04302 


2,364202 


19,47508 


0,000000 


PA 


1,43348 


1,284517 


1,11597 


0,202605 


PB 


9,92143 


1,470697 


6,74608 


0,000000 



Table 4.3: Levene test for Homogeneity of Variances - case 3, channel A 
without flux, channel B fed 



Case 4. Finally, we considered the case 4 for which both input channels are fed 
with stellar flux. In this set of data, we have 500 records instead of 
100; for homogeneity with the other cases the test is repeated over 100 
records at a time (tables from l4.4 lto l4.8p . The results are consistent with 
those found before, i.e. all channels have inhomogeneous variance, since 
p-values are smaller than 0.05. 



Channel 


MSEffect 


MSError 


F 


P 


11 


327,3119 


8,94604 


36,58735 


0,00 


12 


397,9197 


10,00692 


39,76444 


0,00 


PA 


157,4086 


4,00455 


39,30744 


0,00 


PB 


12,4603 


1,36514 


9,12750 


0,00 



Table 4.4: Levene test for Homogeneity of Variances - case 4, channel A 
and B with flux, record from 1 to 100 



Channel MSEffect MSError F p 

11 143,8233 4,893480 29,39079 0,00 

12 313,9400 6,308566 49,76409 0,00 
PA 168,0298 4,04593 41,53058 0,00 
PB 1,7106 1,153894 1,48249 0,001 

Table 4.5: Levene test for Homogeneity of Variances - case 4, channel A 
and B with flux, record from 101 to 200 



We can conclude that the beam flux has a real variability along the time scale. 
This is of course reflected on the combined channels. The Levene test ensures 
us that this is a true inhomogeneity because it takes care of the changing mean 
value over different records. 
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Channel 


Moil/Iiect 


MS Error 


r 


P 


T1 

11 


qqq '?'?q'? 


99 D'?! 98 


,ouuuo 


n nn 


12 


995,6715 


22,337121 


44,57475 


0,00 


PA 


211,2722 


4,81973 


43,83484 


0,00 


PB 


35,1254 


1,64634 


21,33541 


0,00 



Table 4.6: Levene test for Homogeneity of Variances - case 4, channel A 
and B with flux, record from 201 to 300 



Channel 


MSEffect 


MS Error 


F 


P 


11 


668,3774 


38,57011 


17,32890 


0,00 


12 


639,9936 


38,75130 


16,51541 


0,00 


PA 


119,9992 


5,25826 


22,82107 


0,00 


PB 


25,0272 


2,14108 


11,68906 


0,00 



Table 4.7: Levene test for Homogeneity of Variances - case 4, channel A 
and B with flux, record from 301 to 400 

Levene test 2 

With this test, we want to assess if the variance in two channels changes over a 
fixed time interval At. So we consider the simultaneous records of two different 
channels, and we evaluate each variance, and then we compare them. We can 
repeat this procedure for all the records (100 or 500). We remember that we 
are working, for these tests, with sample variance. Now, the hypothesis are: 

Ho : ai(At) = a2(At) 

Hi : ai{At) ^ a2{At) (4.16) 
where At varies along each record. 

Data has been organized in order to have 100 (500, respectively) variables, 
representing the repeated measures, each containing one record of the two 
channels under testing. The channel variable creates 2 groups at the same 
level, and the variances of these two groups are evaluated and compared. This 
test can be repeated on both channel pairs, i.e. input and output. 
This test shows (table 14.91) an interesting feature of the combination system. 
In general signals after the combination are more balanced than before, in 
terms of fiux intensity. This means that the combination/splitting system is 
able to sum up factors with different fiux intensities and to split the sum into 
balanced part. Moreover, it says that when fiux from a source is injected 
in both channels (case 4), beams coming in front of the combining system 



Statistical analysis of interferometric data 



129 



Channel 


Moiiinect 


ivioiiirror 


r 


P 


T1 

1± 




44 fi474Q 




n nn 

u,uu 


12 


690,7584 


44,31748 


15,58659 


0,00 


PA 


135,0856 


5,30639 


25,45718 


0,00 


PB 


18,8933 


2,22661 


8,48522 


0,00 



Table 4.8: Levene test for Homogeneity of Variances - case 4, channel A 
and B with flux, record from 401 to 500 



Case 


PA and PB 


11 and 12 


1 


40% 


90% 


2 


2% 


16% 


3 


66% 


28% 


4 


5.2% (26/500) 


65.8% (329/500) 



Table 4.9: Levene test for Homogeneity of Variances in two synchronous 
channels 

have very different amplitude. Case 2 and 3, in which only one arm of the 
combiner is fed, are less interesting, even if we can say that when there is a 
strong unbalance between the two input channels, the combined outputs are 
less stable. 

4.7 Conclusions for variance analysis 

The analysis of the variance of the VINCI data has evidenced the following 
features of the handled signals. The first test has given us a statistical evi- 
dence of the fact that the variance of channels fed with stellar flux changes 
in subsequent records. This means that the flux is subject to variation in a 
single observation run (composed of different records), so the parameters of 
the interferometric models have to be updated at a high rate. 

However, the second test has shown that the combination system is able to 
handle properly even unbalanced inputs, and to split them correctly in two 
equal part, not only in terms of mean, but also in terms of variance. 

A test of this kind can be used to check instrument reliability and repeatabil- 
ity, for both on-line and off-line analysis. In fact, off-line analysis must include 
instrument parameter estimate at low level, and a real-time instrument like 
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a fringe tracker must foresee an on-line update of operating parameters on a 
comparably short time scale, e.g. faster than 1 Hz. Suitable diagnostics mod- 
ules have to be included to ensure that data quality is preserved throughout 
the observation. 

4.8 Analysis of interferometric output variabil- 
ity sources 

Our further task is the analysis of the variability of the interferometric outputs. 
Some information can be retrieved from the interferogram itself, with statistical 
moments or autocorrelations, as we did before. 

However, the availability of the system inputs, i.e. the photometric channels, 

allows to study the variability of the output beams as a function of that of the 

input beams. In this way, it is possible to investigate if the input variability 

is sufficient to explain the output one, or if we can suspect another variability 

source, perhaps from instrumental contribution. 

This subject is addressed in this section. 

To focus the problem, we have to make some assumption: 

1. the photon noise, the shot noise and the detection noise have the same 
properties over each channel; 

2. the noise level is comparable inside and outside the coherence length. 

The ffist assumption is needed because each signal, photometric or interfero- 
metric, is subject to the detection process independently from all others, and 
there is no way to check differences. The second is due to the fact that the 
modulated part of the signal has a strong variability that can not be considered 
as 'variance'. So it is difficult to analyze the variance in the coherence length, 
and we must trust the results outside this applicable inside, in terms 

of noise estimate and characterization. 

The model we want to use is a simplification of that represented by eq. 14. 2[ 
The two photometric channels, PA and PB, can be factorized in the sum of 
the 'true' values, PA{x) and PB{x) respectively, and of variability sources, 
epA{x) and epB{x): 



PA{x) = P~A{x) + epA{x), PB{x) = P~B{x) + epB{x), 



(4.17) 
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where x is the spatial variable for the optical path difference. Since the OPD 
is modulated to produce the fringes, we have that x = x{t). With the notation 
of eq. 14.171 we hide the dependence of the OPD on the time t. 
When PA and PB are physically combined in order to produce the inter- 
ferometric channels II and /2, also the noises epA^x) and epB^x) enter the 
combination, causing a variability on the interferometric outputs. 
We wonder if the introduction ofepA^x) and epp^x) in the combination process 
is sufficient to explicate all the variability on II and 12. 
In other words, we want to quantify the weight of emi(x) and £^2(2;): 



where mi{x), z = 1, 2 is the modulation function containing fringes. 
The ideal combination is noiseless, i.e. emX^) = 0? i = 1,2. However, we can 
expect some kind of superposed noises, caused, e.g., by the physical instru- 
ments dedicated to the composition/separation of beams (fibres in this case, 
or optical combiners). If it is the case, we further assume that this noise due 
to the combination process is uniformly present on the data. 
Is it possible to quantify its weight? 

In our region of interest, outside the coherence length, we can suppose that 
the linear model is predominant with respect to the modulation function m{x). 
The residual modulation is covered by photometric fiuctuations and noise. 
Is it possible to quantify also the weight of the modulation outside the coherence 
length? 

The statistical tool that can answer these questions is the regression analysis. 
It is suggested by the interferometric model itself (eq. 14.181) . that also address 
the use of a linear model. In particular, the comparison of a linear model and 
a 'mixed' linear model, i.e. with a higher order factor to describe the non- 
linearity of the interferometric combination, can tell us something on the third 
question. 

4.8.1 Review of the multiple regression analysis 

Let consider a general regression model (in matrix form): 



Il{x) 
12{x) 



{Pi,aPA{x) + /3i,bPB{x)) ■ [1 + mi(x)] +e^,(x) 
(/52,aPA(x) + P2,BPBix)) ■ [1 + m^ix)] + em.ix) (4.18) 



Y = (3X + e 



(4.19) 
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where Y is the vector of the observed data, X the regressors vector, f3 are the 
coefficients, and e the model error vector. 

A core principle of the least squares regression method is the fact that the 
variability of a dependent variable can be partitioned over the sources of vari- 
ability, i.e. the predicted variables and the residual error. A fundamental 
identity of the least squares states the following relation between square sums: 

N N N 

J2(y^ - = Y.(y^ - y)' + - y^f ~ = ^^^'^ + (^.20) 
1=1 1=1 1=1 

where y are the observed values of the dependent variable, with mean 
and y are its estimated values through regression analysis. The quantity 
Yl^=i{yi ~ ViY is the square sum of the residuals (SSError, SSE), while SST 
is the total squares sum and SSM is the model squares sum. SSM and SSE 
depend from the model adopted, and they can be evaluated asjlHl pg. 4]: 

SSM = h'X'Y 

SSE = SSTotal - SSModel = ¥'¥ -b'X'Y (4.21) 

The residuals are defined as the difference between the foreseen and the mea- 
sured values: 

£i = yi-yi, i = l...N (4.22) 

If we assume the residuals to be uncorrelated random variables, with zero 
mean and constant variance, and the regressors to be measured without error, 
than the estimation through the least squares approach is optimal, in the sense 
that the variance of all other linear estimators is greater than that of the least 
squares estimator. Note that the residuals are not required to belong to the 
same distribution, or to be independent. 

For tests of significance, the random errors Ei are often assumed to be normally 
distributed. In this case, the least square estimators are also the maximum 
likelihood estimators. 

The ratio of the sum of squares of the regression model to the total sum of 
squares {R^ = = 1 — fff ) explains the proportion of variance accounted 
for the dependent variable (y) by the model. This ratio varies between and 
1. If i?^ = 1, the variance is perfectly explained by the model and there are no 
residuals; if i?^ = 0, the model could explain nothing of the observed data. 
can then be used as an estimator of the goodness-of-fit of the model. However, 
the analysis of residuals is important to validate or not the test on the R ratio. 
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checking if the assumption of the normal distribution of the residual of the 
least squares model is valid. 

The correlation matrix of the parameters can give an indication wether the 
parameters are redundant or not. In fact, if two parameters are highly corre- 
lated, we can infer that they carry/bring the same amount of 'information'. 

We have remarked that some assumptions have to be done to trust the regres- 
sion results. But what happens if one of them is violated? A treatment of the 
subject can be found in Rawlings et al [13]. 

The non normality of the residuals does not affect the coefficients estimation, 
as it is not required in the partition of the squares sums, but can cause trou- 
bles on the tests for their significance and for the confidence intervals, that are 
all based on the normal distribution. Moreover, the estimators are still the 
best between all linear estimators, but are no longer the Maximum Likelihood 
estimators. The same apply if there is a correlation between the residuals. 
Techniques exist to face this problem (generalized least squares), but if the 
residuals covariance matrix has to be estimated by the data, the results could 
be worse than before. 

The property of the minimum variance of the estimators depends directly also 
from the hypothesis of homogeneity of the variance of the residuals. If it is 
not the case, it is necessary to introduce weights on the regression analysis. 

A different treatment is necessary if the independent variables and/or the re- 
gressors have measurement errors. References can be found in [^ET, p. 91], ^51 
p. 123]. We can resume saying that, if the dependent variables have measure- 
ment errors, these errors increases the residuals, reducing the coefficient, 
and so leaving more unexplained variability on the model. 
The situation becomes worst if the regressors are measured with errors. If the 
regressors are fixed and the measurement errors are normally distributed with 
same variance and zero mean, the coefficients estimators will still be unbiased. 
On the contrary, they will be biased if the regressors are random variables, and 
the bias will be function of the correlation between the true unknown random 
regressor and the measurement error variance. 

Finally, we remark that the described method applies also to multiple regres- 
sion, in which there are two or more regressors, as is our case. 
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4.8.2 Method description 

In this paragraph, we will describe the procedure adopted for the regression 
analysis, the outputs produced and how we evaluated them. The analysis 
has been carried out with the software STATISTICA (produced by StatSoft: 
www . st at sof t . com) . 

Our data, as said before, are integer numbers, they can be considered as counts. 
It seems that in this case the normality distribution required for the model error 
are not valid any longer. However, the mean level of counts is high (order of 
hundreds counts). 

With this preliminary remark in mind, we have prepared the data according 
with the introductive discussion. 

First of all, for each record, we've eliminated the coherence length area. 
In order to have information about the possible lack of homogeneity of the 
variance on the channels, we have performed a Levene test for homogeneity of 
variances on the residual data (for description of the Levene test, see section 
14.6.11) . We have divided data into subintervals, to check if the variability of 
data from one subinterval to another was due to mean variation or effectively 
to variance. We have repeated this test on both photometric and interferomet- 
ric pairs of channels, and we have retained records for which the variability 
on the input channels PA and PB is homogeneous. We have further distin- 
guished between records with homogeneous and inhomogeneous variance on 
the output channels II and 12, retaining the first. 

The chosen models do not foresee an intercept coefficient. The reason is that 
in the ideal denoised case of photometric channels set to zero, we want to have 
interferometric outputs set to zero. 

For each considered regression model, we have performed test on the coeffi- 
cients of the regression model, to see if they were null, on the residual unex- 
plained variance, and we have studied the residuals. 

The table summarizing the model description looks like the one in figure I4.24[ 
Here after we explain the meaning of each column in the tables: 

• R multiplo (Multiple R): it is the positive square root of R^. 

• R^: is the coefficient of the multiple correlation. It measures the compo- 
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Test 55 Modello Compl. vs. S5 Residui (AII_PA-PB-l1-l2_StationaryData) 



Dipend. , 
Variabile 


Multipio 
R 


Multipio 

R= 


Aggiust: 1 SS 
R=^ 1 Modello 


gi 

Modello 


MS 
Modello 


SS 
Residuo 


gi 

Residuo 


MS 
Residuo 


F 


P 


11 


0,995947 0,991911 
0,997516 0,995038 


0,991910 
0,995036 


9,974884E408 


2 


498744220 


8134867 


47723 170,4601 


2925871 


0,00 


12 














0,00 



Figure 4.24: Table of tests on the model. This example is taken from 
section 14.8.41 and is referred to the linear model without factors of higher 
order. 



nent of the total variability due to the independent variables. It is useful 
because it takes care of the presence of multiple regressors. We recall its 
definition as the ratio of the squares sums of the model and total sums 
squares: 

2 _ . SSE 
^ -^-^ 

• E? aggiustato ( Corrected R^): it is obtained from the definition divid- 
ing the error squares sums and the total squares sums by their degrees of 
freedom {n — k and n respectively, where k is the number of independent 
variables and n is the number of cases used in the regression) 

• SS Modello and SS Residuo: squares sum of the regression model and of 
the residuals, respectively 

• gl Modello and gl Residuo: degree of freedom (df ) of the regression model 
(A;, where k is the number of non correlated independent variables) and 
of the residuals (n — k), respectively 

• MS Modello and MS Residuo: mean square sum (^) of the model and 
of the residuals, respectively 

• F, p: test to verify the statistical significance of the R^ measures. It is 
computed: 

MS Model 
~ MSResidual ^ ^^'""'^^ 

Note that Statistica automatically marks in red the results that have a positive 
significance, for a quicker understanding. 

Another table of interest is the one shown in figure 14.251 reporting different 
statistics for each regressors, and we resume them: 



136 



Statistical analysis of interferometric data 



Statistiche collinearita per termini in equaiione (AII_PA-PB-l1-l2_St alio nary Data) 
Parametrizzazione sigma-ristretta 



Tolernza 



Effetto 



Varianza 
Fat.infl 



R quadro 



11 

Beta in 



0,3856561 2,592981 



II 

Parziale 



n 

Semi-par 



II 

t 



J1 



PA 




□ ,61 4344 □ ,3671 12 ,930227 ,227981 
0,614344 0, 681 760 : 0.978172 0.423381 



12 I ;i2 

Parzial^ ! Serni-par 



553,740 0,00 
1028,343 0,00^ 



:j2 
t 



0,679006 0,986333 
0,371803. 0,956480 



0,421671 
0,230894. 



1307 ,739 
716,077 



12 



0,00 ] 
0,00 J 



Figure 4.25: Example of correlation analysis for PA and PB 



• Tolerance: it is defined as 1 — i?^. If the tolerance is small, the variable 
is highly correlated with the others variables, and this increases its re- 
dundancy. In the example table, we can see that the tolerance is small 
for every regressors, so there is no redundancy 

• Variance inflation factors (VIF): the elements on the diagonal of the 
inverse of the correlation matrix, used in the model computation. It is 
another measure of the redundancy of the variables: if the VIF is 1, the 
predictor variables are uncorrelated. In our case, they are not exactly 1, 
as the tolerance was not exactly 

• R^: as before, the multiple correlation coefficient 

• Beta inserted ((3) : the standardized regression coefficient, i.e. the coef- 
ficients obtained if the variables were standardized with zero mean and 
unitary standard deviation before being used in the model. They differ 
from the 'B' coefficients, that could be affected by errors due to different 
behaviour of the related independent variables. 

• Partial correlation: the correlation between the dependent variable and 
the independent ones, taking into account the presence of other corre- 
lated variables. It can be interpreted as the percentage of non-explicated 
variability of Jj, j = 1, 2 due to a regressor after having 'subtracted' the 
contribution of the other regressors. 

• Semi-partial correlation: as the partial, but related to the total variance 
of/,. 
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• t, p: test t for these statistics and related p-value. 

Finally, the model coefficients come with the B and f3 values, with their stan- 
dard errors, the test t associated and the confidence intervals. 



When working with time series, raw residuals, i.e. the differences between the 
observed and the predicted values, are usually correlated and have a variance 
that changes, (see, e.g. page 342]). To test the serial auto-correlation of 
the raw residuals we have implemented the Durbin- Watson test[l6]. The test 
statistic is 



d 



e: 



N 



2(1 -p) 



(4.23) 



where p is the residual sample-autocorrelation at lag —1. The statistic d can 
assume values in the range [0, 4] and becomes smaller as the correlation in- 
creases. The statistic distribution is not known, but has been tabulated with 
experimental texts by Durbin & Watson. Two bounds, a lower and a upper, 
dependent from the number of residual samples A^, the number of regressors, 
without the intercept, and the confidence level, define two doubtful regions, 
where the test does not permit a decision, a central area of no autocorrela- 
tion, and two lateral areas where there is a statistical evidence of positive and 
negative serial correlation, respectively. Figure H.26I shows these areas. 



no 
dedsi«L 




negative 

auto 
correladai 



di du 2 4-du 4-di A 

Figure 4.26: Decisional areas for the Durbin- Watson test. In the figure, 
and du are the lower and the upper bound, respectively. 



The presence of a serial correlation modifies the properties of the coefficient 
estimators: they are still unbiased, but they are no longer the best estimators. 
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For our analysis purposes, since tabulated values foresee up to = 100, we 
have further divided the residuals series in subsets of 100 samples, and we have 
set the confidence level at 95%. 

To control the assumption of the regression model, several modification of 
the raw residuals have been proposed. We choose the Standardized residuals, 
corrected to equalize their variances. The standard residuals are evaluated by 
Statistica using the following formulation: 

ef= 1 = 1. ..N (4.24) 

In the following paragraphs, we proceed to the description and comparison of 
regression analysis using both a multiple linear model and a multiple linear 
model with a factor of higher order, respectively. 

4.8.3 Estimation of photometric coefficients through cal- 
ibration analysis 

First of all, we use the least squares regression to compute the coefficients [3ij 
of the photometric channels in the eq. 14.181 performing the analysis on calibra- 
tion records. These coefficients will be useful for comparison with successive 
analysis. 

The calibration data consists, as described in par. 14.21 in sets recorded with 
just one photometric channel fed with source ffux, while the other is void, and 
contains just background or environmental noise. The level of the interfero- 
metric channels gives immediately the coefficients of the interested photometric 
channel: 

Ji = /?i,aPA(x); h = P2APM^) 

h=(3^^BPB{x)- h=l32,BPB{x) (4.25) 
using respectively data of case 2 and case 3. 

It is clear that in this case we need the regression analysis with just one regres- 
sor, PA for case 2 and PB for case 3. In the next figures 04.271 and I4.28p . we 
can find a summary of the model properties, the coefficients values with their 
tests, and the residuals normal probability plots for case 2 (channel PA fed) 
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and case 3, respectively. We have to remark that the analysis has been done 
selecting only records with variance homogeneity properties on all channels for 
PB and on regressors for PA, since for the latter no ideal record was in the 
calibration set. 



Test 55 Modello Compl. vs. SS Residui (chA_PA-PBStationaryRecord) 



Dipend. , 
Variabile 


Multipio 
R 


Multipio 1 

R= 


Aggiust. 
R=^ 


SS 
Modello 


gi 

Modello 


MS 
Modello 


SS 
Residuo 


gi 

Residuo 


MS 
Residuo 


F 


P 


11 


0,9974071 


0,994820 0,994017 


7052591 


1 


7052591 


36720,56 


1577 


23,20508 


302000,3 


0,00 


12 


0,998022 


0,996049 0,996046 


18849208 


1 


18849208 


74777,38 


1577 


47,41749 


397515,9 


0,00 



Stime Parametri (chA_PA-PBStationaryRecord) 
Parametrizzazione sigma-ristretta 



Effetto 



11 

Param. 



Err.Std 



t 



JL 



-95,00% 
Lmt.Cnf 



-f95,00% 
Lmt.Cnf 



Beta (d] I Err. St. IS 



-95,00% 
Lmt.Cnf 



-f95,00% 
Lmt.Cnf 



PA 



0,79061110,001437 550,3456 0,00 0,787793 0,793429 0,997407 0,001012 0,993052 1,000962 



12 

Param. 



12 

Err. Std 



12 
t 



12 

JL 



-95,00% 
Lmt.Cnf 



-f95,00% 
Lmt.Cnf 



12 

Beta (d) 



12 

Err. St.G 



-95,00% 
Lmt.Cnf 



-f95,00% 
Lmt.Cnf 



1,292513 0,002050 630,4886,0,00 1,288492 1,296534 0,998022 0,001583 0,994917 1,001127 



Normal Prob. Plot; Residui slandardizzali Normal Prob. Plot; Residui slandardizzali 

Variabile dipendente: 11 Variabile dipendente: 12 

(Camp, analisi) (Camp, analisi) 




.01 



Residuo Residuo 

Figure 4.27: Case 2: channel A fed, channel PB void. First row, summary 
of the model; second row, the regression parameters with tests; third row, 
normal probability plots of the residuals for the dependent variables II (left) 
and 12 (right). 



The regression model explains very well the variability of the dependent vari- 
ables II and 12 for the regressor PA, for PB there is some more uncertainty, 
but it is still good. The analysis of the residuals shows their normal distri- 
bution for channel PA, whereas for PB the residuals have tails that do not 
respect normality. If we use all the records, the coefficients do not change 
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Test SS Modello Compl. vs. SS Residui (chB_StationaryRecord) 



Dipend. , 
Variabile 



Multipio 
R 



Multipio 



Aggiust. SS 
R= Modello 



Modello 



MS 
Modello 



SS 
Residuo 



gi 

Residuo 



MS 
Residuo 



0,982736 0,965771 0,965758 



0,984256 0,968759 0,968747 



7889397 1 7889397 279617,1 

2934960 1 2934960 94647,9 



2629 106,3587 74177,24 0,00 
2629 36,0015 81523,28 0,00 



Effetto 


Stime Parametri (chB_StationaryRecord) 
Parametriziazione sigma-ristretta 


11 

Param. 


11 

Err. Std 


11 

t 


11 
P 


-95,00% 
Lmt.Cnf 


495,00% 
Lmt.Cnf 


11 

Beta (13) 


11 

Err. St. 13 


-95,00% 
Lmt.Cnf 


-H95,D0% 
Lmt.Cnf 


PB 


3,379184^ 


.0,012407 


272,3550 0,00 


3,354855 


3,403513 


0,982736 


0,003608 


0,975661 


0,989812 




12 

Param. 


12 

Err. Std 


12 
t 


12 
P 


-95,00% 
Lmt.Cnf 


-h95,00% 
Lmt.Cnf 


12 

Beta (d) 


12 

Err. St.G 


-95,00% 
Lmt.Cnf 


-^95 ,00% 
Lmt.Cnf 




2,061062 


0,007219 


285,5228 0,00 


2,046907 


2,075216 


0,984256 


0,003447 


0,977496 


0,991015 
























Figure 4.28: Same as fig. I4.27[ but for case 3: channel A void, channel PB 
fed. 
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much, their standard errors decrease, due to the higher number of samples, 
but the residuals are worse. 

To better understand the behaviour of the residuals, we perform the Durbin- 
Watson test for the search of autocorrelation in the time series of the raw 
residuals. The results are shown in figure 14.291 The residuals have been 
divided in 100-samples sized intervals, and the test has been performed over 
each interval. 



in for (1 - PA channel 



NO BECISION 



15 



NO DECISION 



20 25 



NO DECISION 



20 25 



Figure 4.29: Durbin- Watson statistics for the residuals of II (left) and 12 
(right): first row, for the regressor PA, second row, for PB. 



We notice the curious feature of the residuals of the regression with the channel 
PA as regressor: they are uncorrelated. The only particular difi^erence between 
the two channels is that the fiux in PB is lower than PA, and is less subject to 
fiuctuations. So we can say that the regression can easily track the fiuctuation. 
On channel PB there is some different contribution: however, from both time 
and frequency statistical analysis we could not find anything particular. 
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4.8.4 Regression with linear model 

We now use the observational data to test the model of eq. 14.181 As said 
before, we limit the analysis over all records for which the variances of the 
photometric inputs and of the interferometric outputs are not varying along 
time over the record length. A typical example of the raw data is shown in fig. 
KM 




Figure 4.30: Record 2 raw data (left) and utilized data (right). The coher- 
ence length has been eliminated from the record to avoid variance variation 
caused by interferometric fringes. 



We use the following linear regression model without intercept: 

Ii = CAiPA + CBiPB, 1 = 1,2 (4.26) 

where the photometric channels PA and PB are the regression variables, and 
the interferometric outputs II and 12 are the dependent variables. This linear 
model is a good fit of the observed data; in figure 14.311 scatterplots of the 
predicted vs. observed values are shown. The points follow roughly a straight 
line; there are no evident outliers. 

The model gives a good explanation of the variance of the outputs II and 
/2, too. In figures 14.321 and 14.331 the summary table of the model and the 
coefficients values and tests are reported. We notice that the values are 
really high, close to 1, for both the dependent variables II and 12. They are 
marked in red, and the p-value is less than 0.01, so we can accept the results. 
The (5 coefficients values suggest that the division of the incoming beams PA 
and PB on the outputs II and 12 is not balanced, but has a proportion of 
about 33% against 65%. The regression coefficients are slightly different from 
those resulting from the regression of par. I4.8.3t in this case, the channel PB 
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Valori Osserv. vs. Previsti 
Variabile dipendenle: 11 

(Camp, analisi) 




Valori Osserv. vs. Previsti 
Variabile dipendenle: 12 
(Camp, anallsl) 
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Figure 4.31: Scatterplot of observed versus predicted values for II and 12 
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Figure 4.32: Table of tests on the model (first row) and statistics on the 
regressors (second row). 



is reduced with advantage of PA. This could be due to the interaction of 
the two channels, that in the simple regression with just one channel wasn't 
present. 

Even if the model utilized seems to fit very well the data, before validating 
our results we analyze the residuals, in order to check the linear regression as- 
sumption of normal distribution of the residuals, with zero mean and constant 
variance. 
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Stime Parametri (AII_PA-PB-l1-l2_StationaryData) 
Parametriiiaiione sigma-ristretta 
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Figure 4.33: Estimation and statistical tests of regression coefficients 



However, the analysis of the residuals shows that the residuals are not perfectly 
normally distributed, but they have long tails. The first row of figure 14.341 
reports the normal probability plots of the residuals for both II and 12. 
The magnitude of the residuals, plotted against time (i.e. case number) in the 
second row of figure HSU changes with time, first increasing and then decreas- 
ing. We can suspect an inhomogeneous variance of the residuals. 

The results of the Durbin- Watson test are shown in figure I4.35[ It is evident 
the presence of a positive correlation for a large number of sets. 
Even if the model seems very good, caution is in order, due to the residual 
distribution and by their changing magnitude. 

4.8.5 Regression with mixed model 

We repeat the same analysis than in the previous paragraph, using the same 
records of data, but with a different regression model, without intercept but 
with higher order effects: 

I, = CAiPA + CB^PB + CABiPA*PB, 1=1,2 (4.27) 
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Normal Prob. Plot; Residui slandardizzali 
Varlablle dipendente: 11 
(Camp, analisi) 
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Figure 4.34: First row: normal probability plots of the standardized resid- 
uals for channels II and 12. Second row: scatterplot of residuals versus 
number of cases for II (left) and 12 (right). 



Durbin-Walson test for residuals autocorrelation for 11 Durbin-Walson test for residuals autocorrelation for 12 




Figure 4.35: Durbin- Watson statistics for the residuals of II (left) and 12 
(right). 



where, again, the regressors are the photometric channels PA and PB and the 
dependent variables are the interferometric outputs II and 12. 
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The fit of tlie model to the data is very good, as shown in figure 14.361 by 
scatterplots of the predicted vs. observed values. Again, there are no evident 
outliers. 




Figure 4.36: Scatterplot of observed versus predicted values for II and 12 



Also this model gives a good explanation of the variance of the outputs II and 
12. We can see in figure H7371 first row, that the values are the same, just 
a little better for II. 

Finally, in figure 14.331 the B and (3 coefficients, with their standard errors, the 
test t associated and the confidence intervals are plotted. 
The correlation analysis of the independent variables (second row of figure 
I4.37P shows that, for this kind of analysis, the mixed term can not be ex- 
cluded, because the test of nullity has a p-value < 0.01. Of course its weight 
is reduced, being outside the coherence length, with respect to PA and PB, 
as we could expect. 

This term can be easily explained with the presence of the modulation function. 

We now execute the Durbin- Watson test to check for an autocorrelation of 
the raw residuals. Figure 14.381 shows the test results for the residuals of the 
dependent variables for a number of subsequent sets, each of them 100-samples 
sized. It is evident the presence of a positive correlation for a large number of 
sets. 

We have to look again at the standardized residuals in figure 14.391 We can 
notice, comparing with figure I4.34[ that the distribution of the residuals of 
the mixed model is closer to a normal one (first row) and that the magnitude 
of the residuals is more uniform (second row). We can conclude that in the 
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Test 55 Modello Compl. vs. 55 Residui (AII_PA-PB-l1-l2_StationaryData.sta) 
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Figure 4.37: First row, table of tests on the model; second row, correlation 
analysis for PA, PB and PA*PB; third row, estimation and statistical tests 
of regression model coefficients 
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Figure 4.38: Durbin- Watson statistics for the residuals of II (left) and 12 
(right). 



previous case the residuals contained the variability caused by the factor of 
higher order. 



V.M.lil, dip,nd,nt,:H 



V.M.lil, dip,nd,nt,:H 











■ 




■1 


■ 




f ° 


























































































: 















- — i 




i 






























































































































■ 
■ 









■6 -5 -4 -3 -2 -1 1 2 3 4 5 6 



■6 -5 -4 -3 -2 -1 3 1 2 3 4 5 6 



Figure 4.39: Normal probability plots for the standardized residuals for II 
(left) and for 12 (right) in the mixed linear model. 



This fact is in some way surprising because we have chosen data with homoge- 
nous variance. If this higher order term had a strong impact, data should not 
be homogenous, as shown by simulations. Hence, we can say that presence of 
noise makes data to be homogeneous! The faint coefficient of the mixed term 
gives a sort of ratio of modulation / noise. 
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4.9 Conclusions for regression analysis 

We are now able to give an answer to the questions we posed at the beginning 
of this section. 

First of all, from the analysis of the F? multiple coefficients of both models 
we can say that the variability on the inputs was able to explicate almost 
all the variability on outputs. There is of course a small part of variability 
unexplained. Physically, we can identify this quantity with instrumental con- 
tribution to the noise of the system. Its low magnitude means that the system 
does not add strong perturbations to the outputs. 

We could also give a quantitative estimation of this variability: less than 1%. 
Moreover, the characteristics of the residuals give us some information on this 
contribution. We can be confident that it is normally distributed, and its vari- 
ance contains the non-linearity of the combination process. To this variance 
we have to add the contribution given by the difference of the coefficients of 
the calibration analysis with respect to the analysis with both channels: we 
can think of it as a component due to interaction of the two input channels. 

However, there are some negative considerations to do. In fact, we could just 
use a part of our initial data (about 25% of the total data), to restrict the 
analysis to the records with homogeneous data. To enlarge the available data, 
it is necessary to tailor the regression, introducing weights. 
To use data with inhomogeneous variance on the regressors, so on PA and 
PB, we should have more information about the data nature, to be able to 
properly describe the distribution of the regressor random variable, and to 
identify measurement errors. 

Finally, the answer to the third question comes from the comparison of the 
linear and the mixed linear model. The residuals of the first one have some 
inhomogeneity on the variance that is explained by the latter. The presence 
of the higher order mixed factor means that the low-magnitude modulation 
outside the coherence length is not negligible. However, it is very small. The 
Levene test for the homogeneity of variance, applied to simulations of an ideal 
interferogram (see eq. 14. ip without noise, has given evidence of non homoge- 
neous variance for almost all cases (side lobes of the interferometric pattern). 
This means that the noise covers this patterns, at least in the considered 
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records, but it is still identifiable thanks to the higher order factor. 
Finally, we have to remark that the presence of serial correlation between 
residuals does not influence the bias of the estimators, but their variance: they 
are no longer the best estimators. This fact affects especially the estimation 
of the photometric coefficients, since they are used in the normalization of 
interferometric signals. Some authors (see, e.g., Rawlings) have proposed a 
prior transformation of the regression variables before performing the analysis; 
but Rawlings also says that it is always better to retain a good simple model, 
even in presence of inhomogeneity of variance or non-normality. 

4.10 Future improvements 

First of all, the vahdity of the analysis described in this chapter is till now 
limited to the data set considered for the tests. Now that a set of statistical 
instruments is identified and checked, it would be useful to extend the analysis 
to other data set, both from VLTl and from other interferometric instruments, 
to separate peculiar from general features. 

The analysis performed till now suffered from the lack of theoretical informa- 
tion on the handled signals, in particular on their noise statistics. As explained 
in the introductive paragraph, a stochastic model would solve many uncertain- 
ties based on the direct estimation of important features from data. As an 
example, we know that our data have measurement errors: 

— ^ ~t~ 

Xi = Xi + 5i (4.28) 

where Y and X are the measured dependent variables and regressors, Y and 
X are the true values, and e and 6 are the errors of the measure, and finally i 
ranges over the number of data samples. 
Hence we solve the regression model: 

y. = + Si (4.29) 

instead of: 

Yi = pXi + (4.30) 

where fi is the vector of the errors of the hidden regression model. We have 
mentioned that, following Draper & Smith, the joint moments of the random 
variable Xi and Si can be used to correct the estimators of the coefficients. 



Statistical analysis of interferometric data 



151 



This is an important issue especially with respect to the determination of the 
coefficients of normalization for signals using the photometric information (as 
we have seen for FINITO fringe tracker, see chap. [2]). 

Preliminary works in this direction showed that the data are not easy to un- 
derstand. The presence of the trend, the form of the autocorrelation of photo- 
metric data (see par. 14. 3p seems to suggest a process with a memory. To refine 
the field of this kind of process, we have considered the partial autocorrela- 
tion function: the correlation at each lag is purified from the contribution of 
precedent lags. Figure 14.401 shows the first lags [17] for the observational case 
4. It is clear that the partial autocorrelation points decreases exponentially to 
zero. It could be a moving average model, as well as an autoregressive one, or 
the composition of both. It is an intermediate situation, that needs a careful 
analysis. 

Statistical tools and software, such as Statistica, can be of help in this research. 
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Figure 4.40: Partial autocorrelation points for interferometric channels 
(first row) and photometric channels (second row) when both arms are fed 
with flux, ft can be seen that the PACF decreases exponentially to zero. 



Appendix A 



Bias in the power spectral 
density function due to a trend 
removal 

The scope of this section is the evaluation of the bias introduced on the power 
spectral density function (PSD) by the subtraction of an estimated linear trend. 
We first introduce the statistical estimators for the sample moments used in 
the next paragraphs, with their properties. Then we proceed to the estimation 
on the bias. Following ^7\, we will limit our analysis to wide sense stationary 
processes, in order to take advantage of the relationship between the autocor- 
relation function and the PSD. We first recall the properties of the spectral 
density function of a signal with zero mean, as proposed by Manolakis|37j (par. 
IA.2.ip . then we consider the case of a signal with a trend. First we suppose 
the presence of a non-zero mean flA.2.2p . then of a general trend (par. IA.2.3p . 

A.l Statistical estimators 

We choose, as estimator of the mean, the variance and the autocorrelation 
function the correspondent time-sample estimators. We use the notation time- 
sample to highlight the fact that we estimate them directly from a set of 
subsequent samples in time, instead of a set of realizations. Their properties 
are known in literature, see, e.g., Priestley [35]. We report here definition and 
properties, following the notation of In the following, we will refer to 

signals at least stationary in the wide sense, i.e. such that the moments up to 
order two are not dependent on time. 
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Time-sample mean 

The time-samples meaia, evaluated over the set of samples {X{k), k = 1, . . . , N}, 
is an estimator of the mean, supposed to be constant: 



1 ^ 



(A.l) 



k=l 



This estimator is unbiased, since E[fix,N] = jf ^ E[X (k)] = /ix- If the samples 
X{k) are uncorrelated, it is also asymptotically consistent: 
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But if the samples are correlated, the variance becomes: 
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where 7x(0 — cov{X{h),X{h + I)); setting r — h — k, we obtain: 
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We have, if N goes to infinity: 



2 +00 



(A.5) 



for the limiting form of the function g(r) = 1 — jjr. If X is such that its 
autocovariance function ■yx possesses a Fourier Transform f{w), we find that 

(7|27r 
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(A.6) 



So this estimator for the time mean is unbiased and asymptotically consistent. 
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Time-sample variance 

For the estimation of the variance of X{k), we use the time-samples variance 
with unknown mean fix,N- 



k=0 



N 



(A.7) 



If the samples are uncorrelated, this estimator is unbiased, while 



(A.8) 
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is biased. Moreover, it is asymptotically consistent. 
If the samples are correlated, we obtain: 
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having added and subtracted the quantity nx- So the expectation becomes: 
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As the variance of the time mean is asymptotically consistent, the expectation 
of the time variance is biased, but asymptotically unbiased. 

The variance of this estimator will be given in next paragraph as a particular 
case of the covariance estimator. 

Time-sample autocovairiance function 

For the autocovariance function, we use the time-samples autocovariance func- 
tion 7x(0) with X real. As we will use it later, we distinguish the cases in 
which the mean is known or unknown. 

1. known mean /x: 




if < / < - 1 
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otherwise 
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otherwise 
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< [N — 1), the expectation becomes: 
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n=l 



(A.12) 



and it is null otherwise, so this estimator is unbiased. 



2. unknown mean n, estimated with /t 
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Adding and subtracting /i,for 1 < |/| < (A^ — 1) we have: 



E[l. 



+ 



1 








1 


A^ 






1 



A^- / 



E 



E 



E 



N-\l\ 

^[(X(n + /)-/x)(X(n)-/x)] + 

n=l 

N^\l\ 



n=l 



+ 



N-\l\ 
n=l 



(A.14) 



If we approximate the sum till A^ — / with the analogous till A^ we get: 
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n=l k=l \ 

So, the expectation can be approximated with: 
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If A^ tends to oo, the fraction 



JV+|i| 
-N-\l\ 



tends to —1. Moreover, from the 



properties of the time mean, we know that tends to zero if A^ tends 
to oo, so the expectation of 7x(0 tends to 7x(0 if A^ tends to oo. 
Note that the result is underestimated for all lags. 



When / = this expression is exact and reduces to: 

i5[7x(0)]=7(0)-4 
in accordance with eq. lA.lOl 



Exact expression for the covariance of this estimate have been found by 
Bartlett [35], p. 326] if the random process is stationary up to order four, 
but he also gives an approximated formula: 

COY x{l), 1 x{l+h)} ^ — ^ {lx{m)'^x{m+h)+'-fx{m+l+h)'-fx{'m-l)} 



m= 



(A.16) 
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from which: 



var{7x(/)}^^ J2 {ixM + lx{m + l)-fx{m - I)} (A.17) 

m=— oo 

From the last equation we can also deduce the variance of the time- 
sample variance: 

var{o-x,^} ^ ^ E ^xim) ^^^^ ^/(O) (A.18) 

m=— oo 

The variance tends to zero asymptotically. 

Before proceeding, we look at the properties of the following estimator for the 
autocovariance, as we mention it in chapter HJ 

^^^i) = l iv3^Elf[^(^ + 0-/i][^H-/i] ifo<|/|<iv-i 

1 otherwise 

(A.19) 

with k < 1. Following the same procedure of eq. IA.151 we can see that this 
estimator is biased even if the mean is known: 



N- 
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-k 



However, it is asymptotically unbiased. We notice that it overestimates the 
autocovariance of X. If the mean is unknown, we can obtain the same result 
as eq. IA.151 which is an approximation. 

Time-sample autocorrelation function 

The autocorrelation function can be estimated by the time sample autocorre- 
lation function: 

Px(0 = ^ (A.20) 
^x 

If we assume the variance as known, the properties of px(0 '^^^ be found in 
a straightforward way from those of the time-sample autocovariance function, 
simply dividing for a\. 



But if the variance is to be estimated, the computation is complicated by 
the expectation of the ratio of two random variables. However, Kendall 
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has provided an approximation to order for the expectation of the ratio 
between random variables in quadratic form. More precisely, if A, B, and C 
are the r.v., a, b and c the deviation of A, B, and C from their respective 



means, and r is the ratio r = we have: 



^ E[A] r _ lEjab] _ lEjac] lEjbc] 
^""^ y/E[B]E[C] I E[A]E[C]^ E[B]E[Cy 



lE[b^] 'lE\c^ 



which reduces, if i? = C, to: 



Now, 



and 



E[A] r lEjab] E^] 
^^^^ ~ E[B] Y E[A]E[B] + E^[B] ^ ^^'^^^ 



E[ab] = E[{A - fiA){B - /i^)] = cov(A, B) (A.23) 
E[b'^] = E[{B - fiB?] = var(5). (A.24) 



We apply this result in our case. Then, Ai is the estimator of the autocovari- 

ance function 7(/) defined in eq. IA.131 while B is the estimator of the variance 

(see eq. IA.7|) . Both are quadratic in the Xr^ti) variables. 

We have already evaluated E[Ai] = E[^xil)] (eq. IXlSD . E[B] = E[aj^^j^] 

(eq. [XTOj) . E[b^] = varfo-^] (eq. IXTSl) : we still need the term 

cov{Ai,B) = cov(7x(/), 7x(0)). We can substitute in eq. lA. 161 with h = —I to 

find: 

^ +00 

cov(7x(/),7x(0)) ~ ^ 5^ [iximhxim - l)]^ (A.25) 

m=— 00 

Substituting all these equations in eq. IA.22t we obtain: 

(A.26) 
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where we have set: 



a 




m=— oo 



s 



Yl [px{m)px{m-l)]'^ 



(A.27) 



The expectation is a quadratic form in the function. 

A. 2 Bias in the estimation of the power spec- 
tral density of stationary processes 

A random process is said to be stationary if its moments do not depend from 
the time: for example, the mean and variance are constant, the covariance 
depends just from the lag and so on. 

For these processes, a crucial relationship holds, linking the spectral density 
function and the autocorrelation function under appropriate conditions: if 
{X{t)} is a zero-mean continuous parameter stationary process with (non nor- 
malized) power spectral density function h{u) existing for all w, and autoco- 
variance function R{t), then 



A proof can be found, e.g., in Priestley [35 [ p. 211]. A similar relation exists 



function p{t). The function f{uj) is important because it has the properties of 
a probability density function, so it makes a connection between probability 
distribution of the process X and its spectral density. 

A. 2.1 Bias on the PSD in presence of a zero mean 

We can use, following e.g. ManolakisPTl p. 210], this relationship to estimate 
the bias on the PSD for a stationary zero-mean signal {X{n)}n>o- 




(A.28) 



for the normalized power spectral density f{uj) 



and the autocorrelation 
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We begin estimating the autocovariance through the sample autocovariance 
function: 



jjEn=t'^in + l)X*{n) ifO</<iV-l 



rx{l) = { r\{-l) 




if -(n - 1) < / < (A.29) 
otherwise 



The estimation over a finite number of samples is equivalent to the multipli- 
cation of the original samples sequence with the rectangular window 



WN{k) 



1 ifO<fc<iV-l 
otherwise 



(A.30) 



The introduction of the window induces a bias on the estimation, and we can 
estimate it. 



E[fx{ 
If / > 0, we have: 



E 



^ N-l-l 

- J2 X{n + l)X*i 



N 



n] 



n=0 



|/| > 



(A.31) 



E 



[n]w[n] 



- X{n + l)w{n + l)X 
while, if / < 0, E[fx{l)] = E[f*x{-1)]. We then obtain: 

E[rxm = ]^ E E[X{n + l)X*{n)]w{n + l) 

n=— oo 

. N-\l 



(A.32) 



win) 



N 



win] 



■r{l) 



(A.33) 



because Yl'^=-oo ""^(^ + 0^1 



n] 



iV-|/| if|/|<Ar-l 
otherwise 



So, if the window is rectangular, the estimation is asymptotically unbiased. 
We recall the estimation of the covariance of of ^l- IA.16[ 



cov 



^ +00 

{fx(/), rx(/ + /i)} ~ — E {rx{m)rx{m + h)+rx{m + l + h)rx{m~l)] 

{AM) 



m=— oo 
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The covariance is smaU just if the lag / is small compared to N, and successive 
values of f{l) could be correlated. 

The power spectrum of this kind of processes can be evaluated as the Fourier 
transform of the autocorrelation function: 

oo 

i?^(e™) = J2 rx{l)e-''"^ (A.35) 

i=— oo 

We estimate it with the periodogram and the sample autocorrelation function: 

N-l 

Rx{e'n= Yl ^^(Oe-™' (A.36) 

l=-(N-l) 

SO the mean and the variance of this estimator depends from those of the 
autocorrelation functions: 

^-1 AT -I/I 

E[Rx{en]= E E[fx{l)]e-'-' = ^^^^(Oe"™' (A.37) 

Hence, if the window is rectangular (no weights are added to the samples) , the 
periodogram is an asymptotically unbiased estimation of the power spectrum. 
The bias of the estimator depends on the window chosen: if the window is not 
rectangular, we will have the correlation function of the window instead of the 
term {N - \l\)/N. 

The variance does not tend to zero as the window increases. An approxi- 
mate expression for the covariance cov ^^Rx{e'^^^), Rxi^^^^)^ has been found 

by Jenkins & Watts, and it is function of both and Rx{e™'^), so the 

variance is of order of R\{e™). 

A. 2. 2 Bias on the PSD in presence of a non-zero mean 

We now consider a X{k) signal with a trend: 

X{k) = Xs{k) +a{k), k>0 (A.38) 

We relax the hypothesis of sec. IA.2t and we ask Xg to be wide sense stationary, 
i.e., stationary up to order 2: 
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1. iiXs — E[Xs{k)] — IX, Vfc. We assume hereafter that // = 

2. E[Xs{k + n)X*(A:)] = rx^(n) V/c, i.e. it depends just on the separation n 
between the samples 

3. var(X,(A;)) = 

because for the bias on the PSD we use moments up to order 2. We ask, 
however, that Xg has spectral density given by the Fourier transform of the 
autocovariance function: 

fxs{^)^^ E PXs{3)e'^'''' (A.39) 

j=-oo 

so that: 

J=-0O 

The simplest case is the presence of a non-zero mean: a{k) — a k > 0, with a 
constant: 

X{k) = Xs{k) + a, k>0 (A.41) 
The properties of the signal X{k) depend from those of Xs{k): 

1. ixx = E[X{k)] = i_i + a = a 

2. ax — veir{Xs{k) + a) — 

3. ^x{l)^E[{X{k)-i,x){X{k + l)-i,x)]^E[Xs{k)X,{k + l)]^rxXl).^k 

4. px{l) = ^ = VA; 



11 1 .1 , 1 if < A; < AT - 1 

We apply the rectangular wmdow WN\k) = < , to the 



Autocorrelation properties 

How Uli^rfk.^ = < 

otherwise 
signal X{k): 

Xn{k) = X{k)wN{k) 

Although the new signal is different from {X{k)} because it is zero outside the 
window domain, applying such a window does not change the expectation and 
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the variance of the set of random variables {Xfi{k)}i<k^]\f. Of course, if we 
evaluate the covariance function to the signal we get a different function: 

7x«(fc, /) = E[Xn{k)Xn{k + I)] = E[X{k)wR{k)Xn{k + 1)wr{1 + /)] = 

= E[X{k)XR{k + l)]wR{k)wn{l + l) (A.42) 

because 

E[X{k)wR{k)XR{k + 1)wr{1 + I)] = 

oo 

= Yl [X{k)]it)wR{k)[X{k + l)]{t)wR{k + l)p{x(k),x(k+i)}it) = 

t=—oo 

oo 

= w;ij(A;)t/7H(A; + /) 5^ [X(A;)](t)[X(A; + /)](%x(fc),x(fc+z)}(t) (A.43) 



t=—oo 



where P{x(k),x{k+i)} is the joint probability density function of X{k) and 
X{k + 1). So the autocovariance function changes its value depending on the 
window function wr. This signal is no longer w.s.s. Moreover, we have that 

This effect of the application of the window is avoided by the sample autoco- 
variance 7x^(/): 

f jhEn:l[XRin + l)-fix,N][XUn)-f^x,N] ifO</<iV-l 
7x,(0= < ihH) if-(iV-l)</<0 
I otherwise 

(A.44) 

We are dealing with real valued signals, so we have: 

^l) = l Nk\ ^n=i^i^Rin + 1)- Ax,^][XK(n) - f,x,N] if < \l\ < N - 1 
^ 1 otherwise 

(A.45) 

In this definition, we have required explicitly the symmetry, and we have forced 
the autocovariance to be null outside the lag interval [— (A^ — 1), — 1]. 



With this estimator for the covariance, we first assume that the mean is known. 
As seen in sec. IA.lt we have that the estimator is unbiased. This result can 
be obtained in a slightly different way remembering the particular form of the 
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signal, using the window function: 



N - \l\ 



N - 

+00 



rE 



N-\l\ 



^ [XR{n + 1) - ^x,n] [Xnin) - hx,n] 



n=l 



^ [X{n + 1) - HxM [Xin) - fix,N]^N{n)wNin + I) 



4A.46) 



There are not convergence problems transforming this finite sum into an infi- 
nite one, because we add only null terms. 
2 +00 

— — ^ E[{X{n + l) - fix,N){X{n) ~ fix,N)]wN{n)wN{n + l) = 



N- I 



1 



7x(0 WR{n)wR{n + l) (A.47) 

n=—oo 



N-\l 

Now, for the rectangular window wn, the infinite sum results 

00 

n=— 00 

So the expectation becomes: 



N-\l\ ifl<|/|<A^ 
otherwise 



N- 



N-\l 



■lx{l) = 7x(0- 



(A.48) 



(A.49) 



and we find that in this case the estimator is unbiased, in agreement with eq. 

EH 



If the mean is unknown, we have from eq. IA.15I that the estimator of the 
autocovariance function is biased, but asymptotically unbiased. We can find 
again this results, even with the truncated signal, using the properties of the 
window function wjy. 

^ N~\l\ 

E[^xM = Elw^l E [^^(^ + ^) - f^XnMlXRin) - fiXnM] = (A.50) 



n=l 



1 



A^- |Z| 
'n-\i\ 



■ E 



n=l 
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Recalling that Xji{k) — X{k)wN{k) and decomposing the summation in all its 
terms: 



N- 



N-\l\ 



n=l 



N-\l\ N-\l\ 



n=l 



n=l 



N-\l\ 



n=l 



(A.51) 



The expectation of this sum is the sum of the expectations. The first terms 
becomes: 



N 




1^1 




1 




N 




|/| 




1 




N 




|/| 




1 




N 







E 



E 



N-\l\ 



n=l 

+00 

[X{n + - AfXHlf^l?^) - fJ'XR]wN{n + l)wN{n) 



_n=—oo 

+0O 



+ 00 



(A.52) 



where we have used the fact that fix^, = f^x = We know that the infinite 
sum of the rectangular window sums up to — |/|, so we obtain: 



N - 



7x(/)=7x(/). 



N-\l\ 

The last term is the variance of the random variable fiXa^N 



(A.53) 



1 



N 



:E 



N-\l\ 

Y, \.i^XR,N ~ ^^r\ [I^^R,N - l^'Xi, 



n=l 



{AM) 



1 N-\l\ 



A^- 1/ 



n=l 
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and we know from eq. lA.SI that it is equal to: 



2 



a 



Xr 

N 



N-l 



(A.55) 



because is such that it possesses a Fourier Transform (it has only a finite 
number of non-zero values, so the coefficient integrals exist and are finite). 

The mid term gives: 



N 



N-\l\ 



n=l 



N-\l\ 



:E 



N-\l\ 



n=l 



(A.56) 



because fixR,N — fJ'XR does not depend from the summation index n. 

If we do not consider the summation up to A^ — |/|, but up to A^, we can 

conclude: 



N 



N-\l 



l^Xf 



N 



N-\l\ 



2 



(A.57) 



So the total contribution of the mid terms is ^ -kt^^ r, 

N-\l\ I^Xr,n 

Adding up all these terms, we find that the expectation of the estimator of the 
autocovariance coefficient 7x(0 is given by the corresponding "true" autoco- 
variance coefficient 7x(0 with the contribution of the error in the estimation 
of the media, and a third factor catching the correlation between the random 
variables fixR and the set of, in general, correlated {X{k)}: 



E[^xM^lx{l) 



2N 



N 



_ 2 , 2 _ 



(A.58) 



If A^ tends to oo, o"? tends to zero if A^ tends to oo, so the expectation of 7x(0 
tends to 7x(0 if ^ tends to oo, giving an asymptotically unbiased estimate. 
We notice that the denominator of this expectation is not a problem: the lag 
/ is fixed, while A^ grows to cxd: N ^ I. 
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Examples 

To illustrate the behavior of the expectation of the function 7x(0 when N is 
varying, we consider two random processes that have stationarity properties: 
the moving average process of order I MA (I) and the harmonic process H. 
The moving average process of order / is defined as: 

Xt = boEt + bi£t-i + ••• + bi£t-i (A.59) 

where the weights {&i}o<i<« are constant and the {ei}o<i<i are normal dis- 
tributed random variables: Ei ~ N{fii;,a'^). We can suppose, without loss of 
generality, that Ei ~ A^(0, 1), and we obtain: 



^xir)=E[Xt,Xt+r] 



aliPobr + biK+i + ... + bi_rbi) 0<r<l 
r> I 



lx{-r) = 7x(r), 

and so it does not depend on t 

In the particular case when all the weights are equal, h = the previous 

functions simplify: 



E[Xt] = 
7x(|r|) = 



2Hii±i n < Irl < / 

(«+l)2 u ^ |r| ^ f 

\r\>l 



• cri = 7x(0) = 

In picture lA.il the autocovariance function and the expectation of its estimator 
for different N are represented, for a MA(5) with equal weights (left), and for 
a general MA(4) process, with weights b = [0.9 0.85 0.8 0.5 0.1]. In both cases 
the underlying random variables have normal distribution A^(0, 1). 
The harmonic process is defined as: 

K 

Xt = Y, AiCos{wit + (Pi) (A.60) 

i=l 

with {Ai, Wi}i<i<K and K constant, and {4>i}i<i<K a family of random vari- 
ables i.i.d. with rectangular distribution over the interval [— 7r,7r]. Thanks to 
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MA(5):7^(l) and its approximation for different N [f0.f5...45,50j 



f^A{4): y (1) and its approximation for different N [10,f5...45,50] 




-5 -4 -3 -2 



Figure A.l: Expectation of jxil) at different N for MA(5) with equal 
weights (left), and in a general case (right) 

the properties of the {(f)i}i<i<K family, it is a stationary process V{y4j, Wi}i<i<K, 
WK and Vt: 



E[Xt] = 



7x('^) = E[Xt, Xt^r] = J2f=i l^iCos(wjr), and so it does not depend on 
t, but it never dies out 



-l = 7x(0) = Ef=i|^^ 



In figure IA.21 we show the autocovariance function and the expectation of its 
estimator at different N, for the harmonic process: 



10 



Xt = J^O.lcos(O.5t + 0i) 



(A.61) 



i=l 



where (pi, i = 1, . . . , 10 is a family of uncorrelated gaussian random variables 



Error of the approximation 



The error in the approximation of eq. IA.57I is larger for larger lags because we 
are adding more extra values in the summation. For / = 0, in particular, the 
approximating formula lA.58l is exact, and it becomes: 

2N 



(A.62) 
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MA{10):y (I) and its approximation for ditterent N [10,15...45,50] 




Figure A. 2: Expectation of 7x(0 f^'' harmonic process at different N 



If / 7^ 0, we have: 



N-\l 
1 



■E 



N 



N- 
1 



:E 



.n=l 



iv- 1/1 



n=l 

N 



+ 



n=N-\l\+l 



The last term is the error function: 



err {I) 



:E 



N 



n=N-\l\+l 



(A.63) 



(A.64) 



Estimation properties of the power spectral density 

We now estimate the (non normalized) power spectrum density using the es- 
timate of the autocovariance function in eq. IA.36[ 



Af-l 



Rxie' 



1=-{N-1) 



■iwl 



(A.65) 
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The expectation of this estimate depends from the expectation of the autoco- 
variance function: 

= E ^[7x,(0]e-™'= (A.66) 

1=-{N-1) 

1=-{N-1) ' ' r=-{N-l) ^ ^ j 

Af-1 

/=-{Af-l) 

^ N-\l\N ^ I AtP^^^^ 

Z=-(Af-l) ' ' r=-(Af-l) ^ ^ 

The behavior of this sum is different depending on whether the underlying 
random process has a non-periodic or periodic autocorrelation function. 
In the first case, assuming that the values 7(/) are negligible for large /, than 
the first factor of the sum tends to the true periodogram value for — > +oo: 

Af-1 +00 

E 7x.(0e^™' ^ 7x.(/)e-^'"' = i^xje™). (A.67) 

Z=-(Af-l) /=-oo 

If the autocorrelation function is periodic, this is no longer true. 
The last term of equation IA.66I does not converge to zero if N tends to +cxo. 
The inner sum Ef=-(7v-i) (l - ^) P^fl(^) tends to E^^ooP^aX'^) = 2vr/(0) 
if N tends to +cxd, as we have seen before. 

We illustrate the behavior of this error term for the processes we considered 
before in figures IA.3I and IA.4[ The error shape changes with the number of 
non-zero autocorrelation coefficients. In the MA(1) case, this depends from the 
order of the process, whereas for harmonic process, where the autocorrelation 
functions never dies out, the truncation of the autocorrelation function is a 
computational needs. 

Application to uncorrelated samples 

If the underlying process is a temporal sequence of uncorrelated random vari- 
ables, the estimation of the bias simplify. However, no changes of its properties 



E 
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MA(5): error for the periodogram for different N [10,1 5. ..45, 50] 



rvlA(IO): error for the periodogram for different N [10, 15. ..45,50] 




N = 20 
N-25 
N = 30 
N = 35 
N = 40 
N = 45 
N = 50 
N = 55 




Figure A. 3: Error for the expectation of periodogram at different N for 
MA(5) with equal weights (left), and for MA(IO) with equal weights (right) 




Figure A. 4: Error for the expectation of periodogram at different N for 
harmonic process with equal frequencies; number of lags considered for au- 
tocovariance function: 21 (left), and 41 (right) 
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arise, since the variance of the time-sample mean of an uncorrelated process 
X^{k), A; > is biased, but asymptotically unbiased (see eq. IA.2p . The Xu{k) 
autocovariance function is zero for all lags I 7^ 0: 



so the bias of its estimator will be: 

7x„(0 = 



al if / = 
otherwise 



(A.68) 



N+\l\ 



N{N-\l\)'^Xu 

Substituting into eq. IA.65t we finally obtain: 

N-l 



if 1 = 
otherwise 



(A.69) 



l=-iN-l) 
N-l 

l=-(N-l) l=-(N-l) 



N~l 



N + 


l 


1 


N- 


I 


N 



-iwl 



2 -iwl 

axe 



(A.70) 



Figure IA.5I shows the behaviour of the error on the estimated PSD of an 
uncorrelated process with unitary variance for different N. We can notice that 
for small N (~ 10) the error is relevant, while after a certain window width 
(~ 50 samples) there are not important improvements. 



lunclion for a white noise with = 1 




1 2 3 4 5 6 7 

frequency (rad) 

Figure A. 5: Error for the expectation of periodogram at different N for un 
uncorrelated process 
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A. 2. 3 Subtraction of a trend estimated with regression 

If the trend is not a constant, but has some kind of a functional form, we first 
need to estimate it, and one technique is the regression method. In this section 
we will follow the procedure of E. J. Hannan[39], and we will use his results. 
Under some conditions on the regressors, it can be shown (Grenander-1954) 
that the least squares estimate of the regression parameters has, asymptoti- 
cally, the same covariance matrix as the best linear unbiased estimate. 
Let us consider a regression: 

y = X/5 + e (A.71) 

where y is a n elements vector of dependent variables, X is a (n x p) matrix 
of regressors, (3 is the p-dimension vector of regression coefficients and e is the 
n elements vector of residuals. We limit the analysis to the case where X is 
fixed, or at least where e and X are independent. We further assume that e 
is a real valued stationary random process with continuous spectral function, 
with spectral density fe{oj). We want the X matrix with properties assuring a 
smart behaviour: 

(i) lim„^+oo = oo, j = l...p 

(ii) lim„^+oo Yl^=i Xj{tf/ Eii Xjity = 1' J = 1 • • 
3 limAr_+oo rj^k{h) = Pj,k{h) j,k = l...p, where 

Eii Xj{t)xk{t + h) 



m 



(h) 



is the 'sample autocorrelation value' 

(iv) If we extend the definition of pj^k{h), setting x{t) = 0, Vt ^ [1)^] we 
obtain a matrix {p x p) Rjfc(/i), V/i G |N, j,k = 1 . . .p. We ask R(0) to 
be non singular. 

These requirements assure that the regressors can increase without upper 
bound (i) but with a slow rate (ii), that they are not linear dependent (iv) 
and that it is possible to define a correlation function as limit of the samples 
time averages (iii). Under these assumptions, let e be the residuals of the es- 
timation of the (3 coefficients through least square regression. The PSD of the 
detrended signal will be the PSD of the e residuals: 

N 



t=l 



(A.72) 
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Hannan showed that the bias on this estimate is asymptotically given by: 

lim i?[i?(e-)]=2vr/,M- lim [ 1 - 1 ^i?^(e'*-) ) (A.73) 

W— >+oo W— >+oo \ iV I 



where i?^ is the PSD of the function (j)^{uj): 



N 

t=i 



(A.74) 



defined, in matrix form, by: $ = XP, where P is such that X'P'PX = I. 

This result is very useful because it gives an expression for the bias on the 
residuals that is a function of the properties of the regressors and of the pro- 
cess e. 



A. 2. 4 Application to stochastic processes with a Unear 
trend 

Our aim is the subtraction of a linear trend. 

A polynomial function agrees the assumption (i)-(iv): as t increases to oo, the 
sum of the square values of x(t) goes to oo; for the linearity, the limit in (ii) 
becomes: 

lim = ^l^f±}L = i^ J = l...p (A.75) 

For the same reason, and for the finite sum at the numerator against the in- 
creasing sum at the denominator in assumption (iii), the lim„^+oo fj,k{h) exist, 
and it is zero for all h > N. Note that for h = 0, rjk{h) does not depend on 
n, since we have assumed Xj{t) = 0, \/ t > N, j = 1 . . .p. We can conclude 
that also assumption (iv) is satisfied. 

Let us consider the composed signal of eq. IA.38[ 

p 

X(fc) =X,(A;) + A; > (A.76) 

where has the same features as in Sec. IA.2.21 Then is the e factor 
of eq. IA.71I We subtract an estimate of the trend Yl^=o ^i-^^ through least 
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square regression analysis. The PSD of the resulting signal is the PSD of the 
regression residuals. 

Let assume linear regressor of order p = 1: 

X/5 = ao + aix (A. 77) 

To apply the result of Hannan, we have to find a matrix P such that X'P'PX = 
I. We limit our analysis at the case x{t) G [—1, 1]. In this situation, we can 
find several orthonormal basis, otherwise, we should search for approximate 
solutions. We choose the Legendre polynomials (see, e.g., [50]): 

T 



The P matrix is: 



r 







V2ao 

^ 



(A.78) 
(A.79) 



The bias on the PSD of the Xg signal will be, according to eq. IA.73[ 



27r/,(^) 



1 — lim — 



N 



ituj 



t=l 



+ 



N 



itio 



t=l 



(A.80) 



The finite summations can be written as: 



1 

N 



N 



t=l 



1 / sin feu 



N/2) \ 



2N \ sin(t<;/2) J 



^N^+oo 0, id kn, k = 0, ±1, ±2, . . 

(A.81) 
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2iV 



N 
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t cos(u;t) + i t sin(u;t) 



t=i 



t=i 



3 

2iV 



. , sin iVcu cos^cu 
^ ' 4sin2 cj/2 2sin cj/2 



+ 



\r /^^ ^ 1 " COsfA^C^) N siXi{^Uj) 

Ncos{Nuj) - . . ,J + 



4sin2(cu/2) 2sin(cj/2) 



(A.82) 
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where we have used the relations 

N-l 



t sin (cut) 



p. 6 

sin(A^ti;) 



cosi 



■2Af-l 



UJ 



t=l 



4sin^(cj/2) 2sin(cj/2) 



7V-1 



t COs(co't) 



1 - cos{Nuj) N sin 



■2iV"l 



+ 



t=l 



4sin2(w/2) 2sin(tu/2) 



(A.83) 



There are not convergence problems if u ^ kn, k = 0, ±1, ±2, . . .. The latter 
are discontinuity points for which the limit goes to oo independently from A^. 
For all other u, however, for N ^ oo the limit does not converge, as we obtain: 



3 

2iV 



N 



itui 



t=l 



—N^ ( 1 + 
2N ' 



1 



sin(2A^u; - u /2) 



4sin^(cj/2) 2sin(cj/2) 



From the last equation arises the need of an appropriate windowing of the 
detrended data for the bias compensation. 



Application to uncorrelated samples 

Again, we apply the result to samples whose residuals are uncorrelated, since 
this seems to be the case of the residuals of the data we analyze in chap. HI sec. 
14.31 after the detrending operation. In this case, the spectral density function 
is 

2 

/,M = ^, Vo; (A.84) 

Let al = l. 

We apply eq. IA.73I at different window sizes. Figure IA.6I shows the behaviour 
of the expectation of the estimation through the periodogram for increasing 
A^. The vertical axis is in logarithmic scale. The estimation goes to oo at 
the extreme values of the frequencies interval: these are the points for which 
sin(c<;/2) 0. 

Note that if we subtract just a constant, the PSD is asymptotically unbiased, 
since the correction limit tends to zero as A^ tends to oo. This completes the 
results of Section IA.2.21 
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Figure A. 6: Error for the expectation of periodogram at different N for un 
uncorrelated process 



Appendix B 

Graphics omitted in chapter 4 



In this appendix are reported the graphics that we did not insert in chapter H] 
for ease of reading. 



B.l Statistical analysis in the time domain 



B.1.1 Void channels 



5 0.6 
I 0,4- 



Photometry B: autocorrelograi 



I nterlero metric channel 2: autocorrelogra 
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Figure B.l: Case 1: Autocorrelation function estimate for photometric 
channel PB (left) and interferometric channel 12 (right). 
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Graphics omitted in chapter 4 



Photo melric channel B and inlerferofnetric 



Photo melric channel B and inlerferofnetric 12 
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Figure B.2: Case 1: cross-correlogram between inputs and outputs - PB 
and n (left) and PB and 12 (right) 



B.1.2 Input: photometric signals 



Photometry B - dettended - autocorrelogram Photometry B - dettended - autocorrelogram 

1.2 1 1 1 1 1 1 1 1.2 1 1 1 1 1 




^00 -400 -200 200 400 600 -600 -400 -200 200 400 600 

lag lag 

Figure B.3: Case 4: raw data, autocorrelation functions for photometric 
channel B. Left, the linear trend to subtract is evaluated as a piecewise 
polynomial with breakpoints every 50 samples; right, breakpoints are every 
10 samples. 
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Pholometric channels A&B 



Pholomatnc channel A (deli 
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Figure B.4: Case 2: Cross-correlation functions for photometric channel A 
and B. Left, raw data; right, linear trend subtracted. 



B.1.3 Output in calibration mode 




Figure B.5: Case 2: Autocorrelation functions for interferometric channel 
12: raw data (left) and detrended (right). 
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Figure B.7: Case 3: Cross-correlation functions for intcrfcromctric channel 
II and 12: raw data (left) and detrended (right). 



Graphics omitted in chapter 4 



183 



B.1.4 Output in observational mode: Interferometric 
signals 




B.1.5 Cross-correlation between photometric inputs and 
interferometric outputs 



Interferomelnc channel 11 and photometnc PB - crass correlation Interferometric channel II and photometnc PB - detrended 
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Figure B.9: Case 4. Cross-correlation functions for interferometric II and 
photometric PB channels. Left, raw data; right, linear trend subtracted. 
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Graphics omitted in chapter 4 



Inlerferofnetric channel 2 and photometry A I ntetfero metric channel 2 and photometnc A - detrended 




Figure B.IO: Case 4. Cross-correlation functions for interferometric 12 and 
photometric PA channels. Left, raw data; right, linear trend subtracted. 



I ntarlero metric channels 2 and photometry B Inteiletomatnc channel 2 and photometnc B - detrended 




Figure B.ll: Case 4. Cross-correlation functions for interferometric 12 and 
photometric PB channels. Left, raw data; right, linear trend subtracted. 
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B.2 Statistical analysis in the frequency do- 
main 



Photometric channel B - PSD 



Interferomelnc channel 2 - PSD 
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Figure B.12: Case 1: Power Spectral Density functions for photometric 
input PB (left) and interferometric output 12 (right). 




Figure B.13: Case 2: Power Spectral Density functions for photometric 
input PB: raw data (left) and detrended (right). 



186 
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Interferomelric channel 2 - PSD Interferoinelnc channel 2 - detrended - PSD 
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Figure B.14: Case 2: Power Spectral Density functions for interferometric 
output 12: raw data (left) and detrended (right). 




Figure B.15: Case 3: Power Spectral Density functions for interferometric 
input II (first row) and 12 (second row): raw data (left) and detrended 
(right). 
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Figure B.16: Case 3: Power Spectral Density functions for photometric 
input PA (first row) and PB (second row): raw data (left) and detrended 
(right). 
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Graphics omitted in chapter 4 



Photometry B - PSD Photomeltic channel B - delrended - PSD 
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Figure B.17: Case 4: Power Spectral Density functions for photometric 
input PB (first row) and for interferometric output 12 (second row): raw 
data (left) and detrended (right). 
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B.3 Allan variance 



White noise random process and Pholomet^A White noise random process and Pholomet^B 




10° 10^ 10^ 10° 10° 10^ 10° 10° 

Figure B.18: Case 2. Allan variance comparison between a realization 
of a gaussian white noise (red) and ten records (blue) of the photometric 
channels (first row) and the interferometric ones (second row) for case 3. 
Flux is injected in channel PB (first row, left), while PA is void (first row, 
right). 
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